Hadoop flume tutorial

  • Date: 24th February, 2019
  • By: Prwatech




Welcome to the Hadoop Flume tutorial. In this tutorial, you will learn how to fetch Twitter data with Flume. You can find more advanced tutorials on Flume configuration in Hadoop from Prwatech, a Hadoop training institute in India whose advanced Hadoop course is aimed at tech enthusiasts who want to explore the technology from scratch to an advanced level.

We at Prwatech offer an advanced Hadoop certification course, including Flume setup, for those who are keen to explore the technology in a hands-on training environment.


Fetching Twitter Data with Flume


You will need the following:

  1. Ubuntu v12 (or above)
  2. apache-flume-1.3.1-bin.tar.gz
  3. flume-sources-1.0-SNAPSHOT.jar


Twitter data analysis using flume

Make a new directory in /usr/lib for Flume:

$ cd /usr/lib/

$ sudo mkdir myflume


Move apache-flume-1.3.1-bin.tar.gz to /usr/lib/myflume:

$ sudo mv /home/cloudera/Desktop/apache-flume-1.3.1-bin.tar.gz /usr/lib/myflume

Untar the file inside /usr/lib/myflume:

$ cd /usr/lib/myflume

$ sudo tar -zxvf apache-flume-1.3.1-bin.tar.gz

/usr/lib/myflume will now contain two entries, the archive and the extracted directory:

apache-flume-1.3.1-bin.tar.gz

apache-flume-1.3.1-bin




The apache-flume-1.3.1-bin directory contains several subdirectories, one of which is lib. Move flume-sources-1.0-SNAPSHOT.jar into this lib directory:

$ sudo mv /home/cloudera/Desktop/flume-sources-1.0-SNAPSHOT.jar /usr/lib/myflume/apache-flume-1.3.1-bin/lib/




Go to the conf directory:

$ cd /usr/lib/myflume/apache-flume-1.3.1-bin/conf/



Create a copy of flume-env.sh.template named flume-env.sh in the same conf directory:

$ sudo cp /usr/lib/myflume/apache-flume-1.3.1-bin/conf/flume-env.sh.template /usr/lib/myflume/apache-flume-1.3.1-bin/conf/flume-env.sh




The conf directory will now contain flume-env.sh alongside flume-env.sh.template.




Configure flume-env.sh by opening it in an editor:

$ sudo gedit flume-env.sh

Set FLUME_CLASSPATH to the jar you moved into lib, then save the file:

FLUME_CLASSPATH="/usr/lib/myflume/apache-flume-1.3.1-bin/lib/flume-sources-1.0-SNAPSHOT.jar"





Next, create a Twitter application. Open the Twitter Application Management page and sign in with your Twitter username and password.

Application Details:

* Name:

* Description:

* Website:

* Callback URL: not required

* Finally, click on “Yes, I agree”




NOTE: Get the following information from the Twitter application and fill it in the flume.txt file.

  • ConsumerKey
  • ConsumerSecret
  • AccessToken
  • AccessTokenSecret


Now move the flume.txt file to the Cloudera virtual machine.

Create a new file named flume.conf in the conf directory:

$ cd /usr/lib/myflume/apache-flume-1.3.1-bin/conf/

$ sudo gedit flume.conf




Copy the contents of flume.txt into this file and save it.
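The tutorial does not show what flume.txt contains, so the following is only a sketch of what such a configuration might look like. It assumes the jar provides the Twitter source class from the widely used Cloudera Twitter example (com.cloudera.flume.source.TwitterSource) with the property names shown, and that the NameNode listens on hdfs://localhost:8020. The keywords and the CONF_FILE variable are illustrative. The agent name TwitterAgent and the /user/flume/tweets path match the ones used elsewhere in this tutorial.

```shell
#!/bin/sh
# Sketch: write a minimal Twitter-to-HDFS agent configuration.
# CONF_FILE is a hypothetical variable; point it at your conf/flume.conf.
CONF_FILE="${CONF_FILE:-./flume.conf}"

cat > "$CONF_FILE" <<'EOF'
# Agent "TwitterAgent": one source, one channel, one sink.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

# Twitter source (class and property names assumed from the Cloudera example).
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = <your ConsumerKey>
TwitterAgent.sources.Twitter.consumerSecret = <your ConsumerSecret>
TwitterAgent.sources.Twitter.accessToken = <your AccessToken>
TwitterAgent.sources.Twitter.accessTokenSecret = <your AccessTokenSecret>
TwitterAgent.sources.Twitter.keywords = hadoop, big data

# HDFS sink: write tweets as plain text, one event per line.
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:8020/user/flume/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000

# In-memory channel buffering events between source and sink.
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
EOF

echo "Wrote $CONF_FILE"
```

Replace the four placeholder values with the credentials collected in flume.txt before starting the agent.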




Go to the bin directory and run the final command:

  • $ cd /usr/lib/myflume/apache-flume-1.3.1-bin/bin/
  • $ ./flume-ng agent -n TwitterAgent -c ../conf -f /usr/lib/myflume/apache-flume-1.3.1-bin/conf/flume.conf
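If the agent fails to start, it can help to confirm that the files set up in the earlier steps are where the command expects them. The helper below is a small sketch for that purpose. check_flume_setup is a hypothetical function name, not a Flume command, and the jar name is the one assumed in this tutorial, so adjust it if yours differs.

```shell
#!/bin/sh
# Sketch: verify that the files this tutorial sets up exist under a Flume
# install directory. check_flume_setup is a hypothetical helper.
check_flume_setup() {
    flume_home="$1"
    status=0
    for f in \
        "$flume_home/bin/flume-ng" \
        "$flume_home/conf/flume-env.sh" \
        "$flume_home/conf/flume.conf" \
        "$flume_home/lib/flume-sources-1.0-SNAPSHOT.jar"
    do
        if [ -e "$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            status=1
        fi
    done
    # Non-zero return value means at least one file was not found.
    return "$status"
}

# Example usage:
#   check_flume_setup /usr/lib/myflume/apache-flume-1.3.1-bin
```

The function only checks that the files exist; it does not validate the contents of flume.conf or the Twitter credentials.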




Now use the web browser in the virtual machine to view the records that Flume collected from Twitter.




Go to NameNode Status > user > flume > tweets




As you can see, all the data collected from Twitter is in JSON format. Converting it to CSV makes the collected data easier to read.


For this, we can use an online JSON-to-CSV converter.
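As an alternative to an online converter, the conversion can also be sketched on the command line. The example below assumes that the jq tool is installed (it is not part of this tutorial's setup) and that each line of the collected files is one JSON tweet object. tweets_to_csv is a hypothetical helper name, and the chosen columns are just an illustration using standard tweet fields.

```shell
#!/bin/sh
# Sketch: convert newline-delimited tweet JSON on stdin to CSV on stdout.
# Requires jq (assumed installed); tweets_to_csv is a hypothetical helper.
tweets_to_csv() {
    # Header row first, then one CSV row per JSON object.
    echo 'id,created_at,screen_name,text'
    jq -r '[.id_str, .created_at, .user.screen_name, .text] | @csv'
}

# Example usage (the HDFS file pattern is illustrative):
#   hadoop fs -cat '/user/flume/tweets/*' | tweets_to_csv > tweets.csv
```

jq's @csv filter quotes each string field and doubles any embedded quotes, so commas and quotation marks inside tweet text do not break the columns.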



