Hadoop Flume Tutorial

  • Date: 24th February, 2019
  • By Prwatech
 

Hadoop Flume Tutorial

  Welcome to the world of Hadoop Flume tutorials. In this tutorial, you can explore how to fetch Twitter data with Flume. Learn more advanced tutorials on Flume configuration in Hadoop from India's leading Hadoop training institute, which provides an advanced Hadoop course for tech enthusiasts who want to explore the technology from scratch to an advanced level, like a pro. We at Prwatech, the pioneers of Hadoop training, offer an advanced certification course and Hadoop Flume setup to those who are keen to explore the technology in a world-class training environment.

Fetching Flume Data from Twitter

 
  Prerequisites:
  1. Ubuntu v12 or above
  2. Apache Flume 1.3.1 binary tarball (apache-flume-1.3.1-bin.tar.gz)
  3. flume-source-1.0-SNAPSHOT.jar (the Flume Twitter source)
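
Before starting, it helps to confirm that a JDK is available and that both downloads are in place. This is only a quick check, assuming the files were saved to the Cloudera desktop as in the steps below:

  $ java -version
  $ ls -lh /home/cloudera/Desktop/apache-flume-1.3.1-bin.tar.gz
  $ ls -lh /home/cloudera/Desktop/flume-source-1.0-SNAPSHOT.jar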
 

Twitter Data Analysis Using Flume

Make a new directory in /usr/lib for Flume:

  $ cd /usr/lib/
  $ sudo mkdir myflume

Move the apache-flume-1.3.1-bin.tar.gz archive to /usr/lib/myflume:

  $ sudo mv /home/cloudera/Desktop/apache-flume-1.3.1-bin.tar.gz /usr/lib/myflume

Untar the file:

  $ cd /usr/lib/myflume
  $ sudo tar -zxvf apache-flume-1.3.1-bin.tar.gz

Now we will have two entries in /usr/lib/myflume: the archive apache-flume-1.3.1-bin.tar.gz and the extracted directory apache-flume-1.3.1-bin.

The apache-flume-1.3.1-bin directory contains many subdirectories, one of which is lib. Move flume-source-1.0-SNAPSHOT.jar into this lib directory:

  $ sudo mv /home/cloudera/Desktop/flume-source-1.0-SNAPSHOT.jar /usr/lib/myflume/apache-flume-1.3.1-bin/lib/

Go to the conf directory:

  $ cd /usr/lib/myflume/apache-flume-1.3.1-bin/conf/

Create a copy of flume-env.sh.template as flume-env.sh in the same conf directory:

  $ sudo cp flume-env.sh.template flume-env.sh

Hence the conf directory will contain both flume-env.sh.template and flume-env.sh.

Configure flume-env.sh:

  $ sudo gedit flume-env.sh

Set the following values:

  JAVA_HOME=/usr/lib/jvm/java-6-sun
  FLUME_CLASSPATH="/usr/lib/myflume/apache-flume-1.3.1-bin/lib/flume-source-1.0-SNAPSHOT.jar"

(A quick way to confirm the installation at this point is sketched after the credential list below.)

CREATING API CREDENTIALS:

Go to the Twitter Application Management page and sign in with your username and password, then create a new app. Fill in the application details: Name, Description and Website; the Callback URL is not required. Finally, click on "Yes, I agree".

KEYS AND ACCESS TOKENS:

Open the Keys and Access Tokens tab of the new app.

NOTE: Get the following information and fill it in the flume.txt file:
  • ConsumerKey
  • ConsumerSecret
  • AccessToken
  • AccessTokenSecret
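
Before wiring these credentials into a configuration file, you can confirm that the Flume installation itself starts up. This is only a quick sanity check, assuming the install path used above:

  $ /usr/lib/myflume/apache-flume-1.3.1-bin/bin/flume-ng version

If this prints the Flume 1.3.1 version banner, the binaries are in place and the commands below should work.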
  Now move the flume.txt file to the Cloudera machine and create a new file in the conf directory:

  $ cd /usr/lib/myflume/apache-flume-1.3.1-bin/conf/
  $ sudo gedit flume.conf

Copy the content of flume.txt into this file and save it. (A sketch of what such an agent configuration typically contains is given after the run command below.)

Go to the bin directory and fire the final command:
  • $ cd /usr/lib/myflume/apache-flume-1.3.1-bin/bin/
  • $ ./flume-ng agent -n TwitterAgent -c conf -f /usr/lib/myflume/apache-flume-1.3.1-bin/conf/flume.conf
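
For reference, a TwitterAgent configuration of the kind used here usually looks like the sketch below. This is only an illustrative example, not the exact flume.txt from the tutorial: it assumes the Cloudera TwitterSource class shipped in the flume-source JAR, and the keys, keywords and HDFS path are placeholders you must replace with your own values.

  # Name the source, channel and sink for the agent started with -n TwitterAgent
  TwitterAgent.sources = Twitter
  TwitterAgent.channels = MemChannel
  TwitterAgent.sinks = HDFS

  # Twitter source: streams tweets matching the keywords using the API credentials
  TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
  TwitterAgent.sources.Twitter.channels = MemChannel
  TwitterAgent.sources.Twitter.consumerKey = <your ConsumerKey>
  TwitterAgent.sources.Twitter.consumerSecret = <your ConsumerSecret>
  TwitterAgent.sources.Twitter.accessToken = <your AccessToken>
  TwitterAgent.sources.Twitter.accessTokenSecret = <your AccessTokenSecret>
  TwitterAgent.sources.Twitter.keywords = hadoop, big data, flume

  # HDFS sink: writes the raw JSON tweets under /user/flume/tweets
  TwitterAgent.sinks.HDFS.channel = MemChannel
  TwitterAgent.sinks.HDFS.type = hdfs
  TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:8020/user/flume/tweets/
  TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
  TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
  TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
  TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
  TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000

  # In-memory channel buffering events between the source and the sink
  TwitterAgent.channels.MemChannel.type = memory
  TwitterAgent.channels.MemChannel.capacity = 10000
  TwitterAgent.channels.MemChannel.transactionCapacity = 100

The agent name TwitterAgent must match the -n option in the command above, and the NameNode address in hdfs.path (localhost:8020 here) depends on your cluster.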
  Now we use the virtual machine's web browser to see the records collected from Twitter. In the NameNode status page, browse the file system and go to: user > flume > tweets. As you can see, all the data collected from Twitter is in JSON format, which needs to be converted to CSV so that the user can understand the collected data. For this, we can use an online JSON-to-CSV converter.
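
If you prefer to do the conversion on the command line instead of an online converter, a rough sketch is shown below. It assumes the jq utility is available on the VM, that the HDFS sink wrote files with Flume's default FlumeData prefix, and that the standard tweet fields id_str, created_at, user.screen_name and text are the ones you want:

  $ hdfs dfs -cat /user/flume/tweets/FlumeData.* \
      | jq -r '[.id_str, .created_at, .user.screen_name, .text] | @csv' \
      > tweets.csv

Each JSON tweet becomes one CSV row; the @csv filter takes care of quoting commas and quotes inside the tweet text.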
