Hadoop 2.0 Installation on Ubuntu: Part 2

Source the .bashrc file to set the Hadoop environment variables in the current shell, without having to open a new one:
$source ~/.bashrc

Setup the Hadoop Cluster

This section describes the detailed steps needed to set up the Hadoop cluster and configure the core Hadoop configuration files.
Configure JAVA_HOME
Configure JAVA_HOME in ‘hadoop-env.sh’. This file sets the environment variables, including the JDK location, used by the Apache Hadoop 2.2.0 daemons that the Hadoop start-up scripts launch.
Change to the hadoop-2.2.0/etc/hadoop/ directory first; the configuration files edited in the following steps are all located there.
$gedit hadoop-env.sh
Update JAVA_HOME to point to your JDK, for example:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
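Before saving, it can help to confirm that the chosen path really holds a JDK. The following is a minimal sketch; the helper name check_java_home is hypothetical and not part of Hadoop, and the example path is simply the one used in this tutorial.

```shell
# check_java_home DIR: hypothetical helper, not part of Hadoop.
# Succeeds only if DIR contains an executable bin/java.
check_java_home() {
    local java_home="$1"
    if [ -x "$java_home/bin/java" ]; then
        echo "OK: $java_home"
    else
        echo "No executable bin/java under: $java_home" >&2
        return 1
    fi
}

# Example call (adjust the path to match your JDK):
# check_java_home /usr/lib/jvm/java-6-openjdk-i386
```

If the check fails, list /usr/lib/jvm/ to see which JDK directories are actually installed.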


Create NameNode and DataNode directory
Create DataNode and NameNode directories to store HDFS data.
$mkdir -p $HADOOP_HOME/hadoop2_data/hdfs/namenode
$mkdir -p $HADOOP_HOME/hadoop2_data/hdfs/datanode

Configure the Default File system
The ‘core-site.xml’ file contains the configuration settings for Apache Hadoop Core, such as I/O settings, that are common to HDFS, YARN and MapReduce. Configure the default file system (parameter: fs.default.name) used by clients in core-site.xml. Hadoop 2.x still accepts fs.default.name, but treats it as a deprecated alias of fs.defaultFS.
$gedit core-site.xml
Add the property below inside the <configuration> element:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>


The value has the form hdfs://hostname:port, where hostname and port identify the machine and port on which the NameNode daemon runs and listens. The same setting tells the NameNode which address and port to bind to. Port 9000 is commonly used, and you can specify an IP address instead of a hostname.
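To make the hostname and port parts concrete, here is a small sketch that splits such a value into its two components. The function name parse_fs_uri is hypothetical, and it only handles the hdfs://host:port form used in this tutorial.

```shell
# parse_fs_uri URI: hypothetical helper that splits hdfs://host:port
# into its host and port parts using shell parameter expansion.
parse_fs_uri() {
    local uri="$1"
    local hostport="${uri#hdfs://}"   # strip the scheme
    hostport="${hostport%%/*}"        # drop any trailing path
    local host="${hostport%%:*}"
    local port=""
    case "$hostport" in
        *:*) port="${hostport##*:}" ;;  # port only if one is given
    esac
    echo "host=$host port=$port"
}

parse_fs_uri hdfs://localhost:9000   # prints "host=localhost port=9000"
```

On an installed cluster, hdfs getconf -confKey fs.defaultFS should report the value Hadoop actually picked up from core-site.xml.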

Configure the HDFS

The hdfs-site.xml file contains the configuration settings for the HDFS daemons: the NameNode and the DataNodes.
Configure hdfs-site.xml to specify the default block replication and the NameNode and DataNode directories for HDFS. The actual replication factor can be specified when a file is created; the default is used if none is given at creation time.

$gedit hdfs-site.xml
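Add the properties below inside the <configuration> element. The directory values correspond to the NameNode and DataNode directories created earlier, with HADOOP_HOME at /home/user/hadoop-2.2.0:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/user/hadoop-2.2.0/hadoop2_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/user/hadoop-2.2.0/hadoop2_data/hdfs/datanode</value>
</property>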
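A quick way to catch a mistyped directory is to read the values back out of the file and check that they exist. The sketch below is an illustration only: get_property is a hypothetical helper, and it assumes each value element follows its name element on the same or a later line, with no XML comments in between.

```shell
# get_property FILE NAME: hypothetical helper that prints the <value>
# belonging to the <name> given, from a Hadoop-style XML config file.
get_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1 }
        found && match($0, /<value>[^<]*<\/value>/) {
            # skip the 7 characters of <value> and drop the 8 of </value>
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Example: confirm the configured NameNode directory exists.
# [ -d "$(get_property hdfs-site.xml dfs.namenode.name.dir)" ] && echo "NameNode directory exists"
```

The same helper works for core-site.xml and yarn-site.xml, since all three files share the property/name/value layout.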

Configure YARN framework

The yarn-site.xml file contains the configuration settings for the YARN daemons, such as the NodeManager.
$gedit yarn-site.xml
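Add the properties below inside the <configuration> element. The class property is keyed by the auxiliary service name, so with the service called mapreduce_shuffle it ends in mapreduce_shuffle.class:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>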


Configure MapReduce framework

The mapred-site.xml file contains the configuration settings for MapReduce. Create it from the supplied template, then specify the framework details.
$cp mapred-site.xml.template mapred-site.xml

$gedit mapred-site.xml
In this file, set the mapreduce.framework.name property to yarn so that MapReduce jobs run on YARN.
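As an alternative to editing the file by hand, the sketch below writes a minimal mapred-site.xml from the shell. It assumes the only property needed is mapreduce.framework.name with the value yarn, which is the usual setting on a YARN cluster; the function name write_mapred_site is hypothetical.

```shell
# write_mapred_site FILE: hypothetical helper that writes a minimal
# mapred-site.xml. Assumption: only mapreduce.framework.name is needed.
write_mapred_site() {
    cat > "$1" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}

# Example (run from hadoop-2.2.0/etc/hadoop/; overwrites the file):
# write_mapred_site mapred-site.xml
```

Note that the helper replaces the whole file, so any properties already added by hand would be lost.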

Edit /etc/hosts file

Run ifconfig in the terminal and note down the IP address. Then add this IP address to the /etc/hosts file, save the file and close it.

$sudo gedit /etc/hosts

In this file, the IP address, localhost and ubuntu are separated by tabs.
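The tab-separated layout can be produced with printf, as in the sketch below. The helper name hosts_line is hypothetical, and 192.168.1.10 is only a placeholder; use the address that ifconfig reported on your machine.

```shell
# hosts_line IP NAME1 NAME2: hypothetical helper that prints one
# /etc/hosts entry with its three fields separated by tabs.
hosts_line() {
    printf '%s\t%s\t%s\n' "$1" "$2" "$3"
}

# Example with a placeholder address:
hosts_line 192.168.1.10 localhost ubuntu
```

The printed line can then be pasted into /etc/hosts in the editor opened above.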

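Create the SSH key
Generate an RSA key pair with an empty passphrase, so that the Hadoop scripts can log in without prompting:
$ssh-keygen -t rsa -P ""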

Append the public key to the authorized keys file:

$cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
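If the login still asks for a password afterwards, overly permissive file modes are a common cause, because sshd can refuse keys stored in files that other users may write to. The sketch below tightens the permissions; the helper name secure_ssh_dir is hypothetical, and it assumes the default ~/.ssh location unless a directory is passed in.

```shell
# secure_ssh_dir [DIR]: hypothetical helper that restricts the SSH
# directory and the authorized_keys file to the owning user.
secure_ssh_dir() {
    local dir="${1:-$HOME/.ssh}"
    chmod 700 "$dir" &&
        chmod 600 "$dir/authorized_keys"
}

# Example:
# secure_ssh_dir && ssh localhost
```

A successful ssh localhost without a password prompt indicates the key setup works.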

Restart the system

Start the DFS services

The first step in starting up your Hadoop installation is formatting the Hadoop file system, which is implemented on top of the local file systems of your cluster. This is required only the first time you install Hadoop. Do not format a running Hadoop file system, as this will erase all your data.
To format the file-system, run the command:
$hadoop namenode -format
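Because formatting erases existing data, a small guard can help avoid running it twice by accident. The sketch below relies on the fact that a formatted NameNode directory contains a current/VERSION file; the function name safe_to_format is hypothetical, and the directory in the example is the one created earlier in this tutorial.

```shell
# safe_to_format NAMENODE_DIR: hypothetical guard that fails when the
# directory already contains a formatted file system (current/VERSION).
safe_to_format() {
    local nn_dir="$1"
    if [ -f "$nn_dir/current/VERSION" ]; then
        echo "Refusing: $nn_dir already holds a formatted file system" >&2
        return 1
    fi
    return 0
}

# Example: format only when the NameNode directory is still unformatted.
# safe_to_format "$HADOOP_HOME/hadoop2_data/hdfs/namenode" && hadoop namenode -format
```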

You are now all set to start the HDFS services, i.e. the NameNode and the DataNode, on your Apache Hadoop cluster.
$cd hadoop-2.2.0/sbin/
$./hadoop-daemon.sh start namenode
$./hadoop-daemon.sh start datanode

Start the YARN daemons, i.e. the ResourceManager and the NodeManager. Cross-check the service start-up using jps (the Java Virtual Machine Process Status Tool).
$./yarn-daemon.sh start resourcemanager
$./yarn-daemon.sh start nodemanager
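The jps cross-check can be scripted. The sketch below reads jps output on standard input and prints the name of each expected daemon that is not running; the function name missing_daemons is hypothetical, and it assumes the default jps output of a process ID followed by the class name.

```shell
# missing_daemons: hypothetical helper. Reads `jps` output on stdin and
# prints every expected daemon that does not appear in it.
missing_daemons() {
    local listing name
    listing="$(cat)"
    for name in NameNode DataNode ResourceManager NodeManager; do
        # Compare the second column exactly, so that SecondaryNameNode
        # is not mistaken for NameNode.
        printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' ||
            echo "$name"
    done
}

# Example:
# jps | missing_daemons
```

Empty output means all four daemons were found; any printed name points to a daemon whose log file is worth checking.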

Start the History server:
$./mr-jobhistory-daemon.sh start historyserver