Install Hadoop 3.2.1 on Windows 10 (in an Ubuntu Virtual Machine): Step-by-Step Guide
Prerequisites:
Hardware requirements:
RAM: 8 GB or above
Software requirements:
VMware Workstation - https://www.vmware.com/in/products/workstation-pro/workstation-pro-evaluation.html
Ubuntu 10.x or above - https://ubuntu.com/download/desktop
Required skills: basic Linux commands
Learn Linux: https://prwatech.in/blog/linux/linux-architecture/
1. Install Ubuntu in a virtual machine

2. Power on the virtual machine

3. Open the Terminal
Open it from the applications menu, or press Ctrl+Alt+T.

4. Check that your hostname is ubuntu

5. Set up a single-node Hadoop cluster

6. Create a group called hadoop

7. Create a user called hduser
root@ubuntu:/home/user# sudo adduser hduser

You will be asked for a password twice and then for some optional details. Press Enter to skip each detail and type Y to confirm.
Choose a password you can remember, or use the default "password".
8. Add hduser to the hadoop group
# sudo adduser hduser hadoop

9. Add hduser to sudoers so that hduser can perform administrative tasks
# sudo visudo

This opens the sudoers file, where you add an entry that grants hduser sudo rights.
Save with Ctrl+S, then exit with Ctrl+X.
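
Steps 6 to 9 can be sketched as the commands below. The group and user names come from the steps above. The sudoers entry is an assumption (the common full-privilege form), because the original line is not shown in this guide. The run helper is hypothetical: it only prints each command unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run sudo addgroup hadoop          # step 6: create the hadoop group
run sudo adduser hduser           # step 7: create the user (interactive prompts)
run sudo adduser hduser hadoop    # step 8: add hduser to the hadoop group

# Step 9 (assumed entry): add this line in visudo, below the existing root entry.
SUDOERS_LINE='hduser ALL=(ALL:ALL) ALL'
echo "sudoers entry: $SUDOERS_LINE"
```

Run it once as is to review the commands, then again with DRY_RUN=0 to apply them.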

10. Log out of your system and log in as hduser

Password: the password you set earlier
11. Open the Terminal

12. Configure SSH
# sudo apt-get install openssh-server
Enter your password, then Y to continue.

13. Generate an SSH key pair for communication
# ssh-keygen
Press Enter at each prompt to accept the defaults.

14. Copy the public key to the authorized_keys file

15. Set the permissions on the authorized_keys file

16. Start the SSH service

17. Test your SSH connectivity
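
A minimal sketch of steps 13 to 17 follows. It assumes an RSA key with an empty passphrase stored under ~/.ssh/id_rsa; the key type, file name, and permission values are assumptions, not taken from this guide. Steps 16 and 17 need a running SSH server, so they appear only as comments.

```shell
#!/bin/sh
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"      # default directory used by OpenSSH
KEY_FILE="$SSH_DIR/id_rsa"            # assumed key file name

mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Step 13: generate a key pair without a passphrase, unless one already exists.
if [ ! -f "$KEY_FILE" ] && command -v ssh-keygen >/dev/null 2>&1; then
  ssh-keygen -t rsa -P "" -f "$KEY_FILE" || echo "ssh-keygen failed" >&2
fi

# Step 14: append the public key once (skipped if no key exists yet).
touch "$SSH_DIR/authorized_keys"
if [ -f "$KEY_FILE.pub" ] && ! grep -qxF -e "$(cat "$KEY_FILE.pub")" "$SSH_DIR/authorized_keys"; then
  cat "$KEY_FILE.pub" >> "$SSH_DIR/authorized_keys"
fi

# Step 15: only the owner may read or write the file.
chmod 600 "$SSH_DIR/authorized_keys"

# Steps 16 and 17, to be run by hand:
#   sudo service ssh start
#   ssh localhost
```

If ssh localhost logs you in without asking for a password, the key setup worked.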

18. Disable IPv6

Open the sysctl configuration file (normally /etc/sysctl.conf) in vi with sudo. Press "i" to enter INSERT mode and press Enter at the end of the last line.
Then add the following lines at the bottom, after a "# disable ipv6" comment:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
To save and exit, press Esc, type ":wq", then press Enter.

19. Check whether IPv6 is disabled
If the output is 1, IPv6 is disabled.
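
One way to run the check in step 19 is to read the kernel flag file directly, as sketched below. The ipv6_status function is a hypothetical helper, and the flag path is the standard Linux location rather than something shown in this guide.

```shell
#!/bin/sh
# Report IPv6 state from the kernel flag file: 1 means disabled, 0 means enabled.
# The optional argument (a different file to read) exists only to ease testing.
ipv6_status() {
  flag_file="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
  if [ ! -r "$flag_file" ]; then
    echo "unknown"
  elif [ "$(cat "$flag_file")" = "1" ]; then
    echo "disabled"
  else
    echo "enabled"
  fi
}

# New /etc/sysctl.conf settings apply after a reboot or after `sudo sysctl -p`.
echo "IPv6 is $(ipv6_status)"
```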

20. Download Hadoop 3.2.1 from the following website:
https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/
Download hadoop-3.2.1.tar.gz and save it to the Desktop folder of hduser
21. Move the downloaded file to /usr/local/

22. Go to the /usr/local directory and list its contents
# cd /usr/local
# ls

23. Now untar (unzip) the file using the command: $ sudo tar -xvf hadoop-3.2.1.tar.gz

24. Remove the tar file

25. Create a symbolic link named hadoop that points to hadoop-3.2.1

26. Confirm that the hadoop link exists by entering "ls", then continue with the commands below

27. Change the ownership of hadoop-3.2.1 to hduser and the hadoop group

28. Give all permissions on the hadoop-3.2.1 folder
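
Steps 21 to 28 could look like the sketch below. The download location ($HOME/Desktop) and the 777 mode for "all permissions" are assumptions. The run helper is hypothetical and only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

ARCHIVE="hadoop-3.2.1.tar.gz"

run sudo mv "$HOME/Desktop/$ARCHIVE" /usr/local/            # step 21
run sudo tar -xvf "/usr/local/$ARCHIVE" -C /usr/local       # step 23
run sudo rm "/usr/local/$ARCHIVE"                           # step 24
run sudo ln -s /usr/local/hadoop-3.2.1 /usr/local/hadoop    # step 25
run ls -l /usr/local                                        # step 26
run sudo chown -R hduser:hadoop /usr/local/hadoop-3.2.1     # step 27
run sudo chmod -R 777 /usr/local/hadoop-3.2.1               # step 28 (assumed mode)
```

On a shared machine, a narrower mode such as 755 is usually enough for a single-user setup.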

29. Edit the hadoop-env.sh file

Go to the bottom of the file, press "i", and add the required lines at the end.

#export JAVA_HOME={JAVA_HOME}
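
If you are unsure which path to use for JAVA_HOME, the sketch below derives a candidate from the java executable on your PATH. The guess_java_home function is a hypothetical helper, and the result is only a suggestion to verify before adding it to hadoop-env.sh.

```shell
#!/bin/sh
# Derive a JAVA_HOME candidate from a java executable:
# resolve symlinks, then strip the trailing /bin/java.
guess_java_home() {
  java_path="$(readlink -f "$1")"
  dirname "$(dirname "$java_path")"
}

java_bin="$(command -v java || true)"
if [ -n "$java_bin" ]; then
  echo "export JAVA_HOME=$(guess_java_home "$java_bin")"
else
  echo "java not found on PATH; install a JDK first" >&2
fi
```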
30. Update ~/.bashrc

Go to the bottom of the file, press "i", and enter the lines
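
A typical set of lines for ~/.bashrc is sketched below. The variable list is an assumption based on common Hadoop 3 setups and on the /usr/local/hadoop link created in step 25. The snippet is written to a temporary file so you can review it before appending it.

```shell
#!/bin/sh
SNIPPET="$(mktemp)"

# Quoted heredoc: the $ references are written literally and expand when sourced.
cat > "$SNIPPET" <<'EOF'
# Hadoop environment (assumes the /usr/local/hadoop link from step 25)
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF

echo "Review $SNIPPET, then append it with: cat $SNIPPET >> ~/.bashrc"
```

After appending, open a new terminal or run "source ~/.bashrc" so the variables take effect.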


31. Update yarn-site.xml

Enter the following lines under <configuration>
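
A minimal yarn-site.xml for a single node might contain the property below. This is an assumed configuration, not the one from the original guide. The sketch writes the file to a staging directory (STAGE_DIR is a hypothetical variable) so you can review it before copying the property into /usr/local/hadoop/etc/hadoop/yarn-site.xml.

```shell
#!/bin/sh
STAGE_DIR="${STAGE_DIR:-$(mktemp -d)}"

cat > "$STAGE_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Lets each NodeManager run the shuffle service that MapReduce jobs need -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

echo "Staged: $STAGE_DIR/yarn-site.xml"
```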

32. Update core-site.xml

Enter the following lines under <configuration>

33. Create the temp folder named in core-site.xml and give it appropriate permissions using the following commands
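
Steps 32 and 33 can be sketched together as below. The temp path (/app/hadoop/tmp), the NameNode address (hdfs://localhost:9000), and the 750 mode are all assumptions; substitute the values you intend to use. STAGE_DIR and run are hypothetical helpers: the file is staged for review, and the sudo commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

STAGE_DIR="${STAGE_DIR:-$(mktemp -d)}"
HADOOP_TMP="/app/hadoop/tmp"          # assumed temp folder

# Step 32: stage core-site.xml (unquoted heredoc so $HADOOP_TMP is expanded).
cat > "$STAGE_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$HADOOP_TMP</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
echo "Staged: $STAGE_DIR/core-site.xml"

# Step 33: create the temp folder and hand it to hduser.
run sudo mkdir -p "$HADOOP_TMP"
run sudo chown hduser:hadoop "$HADOOP_TMP"
run sudo chmod 750 "$HADOOP_TMP"
```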

34. Update mapred-site.xml

Enter the following lines under <configuration>
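
For mapred-site.xml, a common single-node choice is to run MapReduce on YARN, as in the sketch below. The property shown is an assumption rather than the original guide's content, and STAGE_DIR is a hypothetical staging directory for review before you edit the real file.

```shell
#!/bin/sh
STAGE_DIR="${STAGE_DIR:-$(mktemp -d)}"

cat > "$STAGE_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Submit MapReduce jobs to YARN instead of the local job runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

echo "Staged: $STAGE_DIR/mapred-site.xml"
```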

35. Create a temporary directory that will be used as the base location for DFS. Create the directory and set the required ownership and permissions using the following three commands

36. Update hdfs-site.xml

Enter the following lines under <configuration>
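
Steps 35 and 36 might look like the sketch below. The base path (/usr/local/hadoop_tmp/hdfs) is an assumption, and a replication factor of 1 is chosen because a single node has only one DataNode. STAGE_DIR and run are hypothetical helpers; the sudo commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

STAGE_DIR="${STAGE_DIR:-$(mktemp -d)}"
DFS_BASE="/usr/local/hadoop_tmp/hdfs"   # assumed base location for DFS

# Step 35: three commands that create the directories and set ownership.
run sudo mkdir -p "$DFS_BASE/namenode"
run sudo mkdir -p "$DFS_BASE/datanode"
run sudo chown -R hduser:hadoop /usr/local/hadoop_tmp

# Step 36: stage hdfs-site.xml (unquoted heredoc so $DFS_BASE is expanded).
cat > "$STAGE_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://$DFS_BASE/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://$DFS_BASE/datanode</value>
  </property>
</configuration>
EOF
echo "Staged: $STAGE_DIR/hdfs-site.xml"
```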

37. Format the NameNode
Close the terminal and run the command in a new terminal


38. Start your single-node cluster

39. Type "jps"

If jps lists the Hadoop daemons (NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager), you have successfully installed Hadoop 3.2.1 on a single-node cluster.
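
Steps 37 to 39 can be sketched as below, assuming the Hadoop bin and sbin directories are on your PATH. The run and missing_daemons helpers are hypothetical. The commands are only printed unless DRY_RUN=0 is set; format the NameNode only on first setup, because formatting erases existing HDFS metadata.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

EXPECTED="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"

# Read jps output on stdin and print each expected daemon that is not listed.
missing_daemons() {
  jps_out="$(cat)"
  for daemon in $EXPECTED; do
    # jps prints "<pid> <name>"; match the name exactly so that
    # NameNode is not satisfied by SecondaryNameNode.
    echo "$jps_out" | awk '{print $2}' | grep -qx "$daemon" || echo "$daemon"
  done
}

run hdfs namenode -format     # step 37
run start-dfs.sh              # step 38: HDFS daemons
run start-yarn.sh             # step 38: YARN daemons

# Step 39: compare jps output with the expected daemon list.
if command -v jps >/dev/null 2>&1; then
  missing="$(jps | missing_daemons)"
  if [ -z "$missing" ]; then
    echo "All Hadoop daemons are running"
  else
    echo "Not running:" $missing
  fi
else
  echo "jps not found; it ships with the JDK" >&2
fi
```

If a daemon is reported as not running, its log file under /usr/local/hadoop/logs is usually the first place to look.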