Prerequisites
Hardware requirements:
Local machine
RAM: 8 GB or above
Software requirements:
VMware Workstation, version 16 Pro
Download the software from the given link: https://prwatech.in/blog/software-installation/vmware-workstation-installation/
Ubuntu, version 18.04: https://prwatech.in/blog/software-installation/vmware-workstation-installation/
Required skills: basic Linux command line
Learn Linux: https://prwatech.in/blog/linux/linux-architecture/
1. Install Java
Command: sudo apt-get install openjdk-8-jdk
2. Check the Java version to confirm that Java is installed
Command: java -version
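The check in this step can also be wrapped in a small function. This is a minimal sketch, not part of the original tutorial: the function name check_java and its messages are made up for illustration.

```shell
# Hypothetical helper: report whether a `java` executable is on the PATH
# and, if so, print its version.
check_java() {
    if command -v java >/dev/null 2>&1; then
        # `java -version` writes to stderr, so redirect it to stdout.
        java -version 2>&1
        return 0
    else
        echo "java not found: install openjdk-8-jdk first"
        return 1
    fi
}

# Run the check; `|| true` keeps a missing Java from aborting a larger script.
check_java || true
```

If Java is missing, the function prints a reminder instead of the version, and its non-zero return value can be used to stop a setup script early.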
3. Download Spark in the terminal using the following command:
wget https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz
4. Extract the archive using the following command:
tar -xvf spark-2.4.0-bin-hadoop2.7.tgz
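For scripted setups, the extraction step can be guarded against a missing or incomplete download. The helper below is a sketch under assumptions: the name extract_spark and the optional destination argument are hypothetical and do not come from the tutorial.

```shell
# Hypothetical helper: extract a gzip-compressed Spark archive into a target
# directory, failing early if the archive file does not exist.
extract_spark() {
    archive="$1"
    dest="${2:-.}"   # default to the current directory
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest"
    # -x extract, -z gunzip, -f archive file, -C switch to the target directory
    tar -xzf "$archive" -C "$dest"
}

# Example usage (commented out so the sketch has no side effects):
# extract_spark spark-2.4.0-bin-hadoop2.7.tgz "$HOME"
```

Extracting into the home directory would match the SPARK_HOME path used later, provided the user's home is /home/user.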
5. Open the ~/.bashrc file in a text editor
6. Add the following lines to the .bashrc file (replace /home/user with the directory where you extracted Spark):
export SPARK_HOME=/home/user/spark-2.4.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
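If you prefer to add these lines from the terminal rather than in an editor, one possible approach is to append each line only when it is not already present, so re-running the setup does not create duplicates. This is a sketch only, and the helper name append_once is hypothetical.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present in it.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example usage (commented out so the sketch has no side effects):
# append_once 'export SPARK_HOME=/home/user/spark-2.4.0-bin-hadoop2.7' "$HOME/.bashrc"
# append_once 'export PATH=$PATH:$SPARK_HOME/bin' "$HOME/.bashrc"
```

The single quotes in the usage lines keep $PATH and $SPARK_HOME unexpanded, so they are written to the file literally and evaluated each time a new shell starts.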
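7. Save the .bashrc file, then reload it so the new variables take effect in the current terminal
Command: source ~/.bashrc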
8. Run spark-shell from the bin directory of Spark (or from any directory, since $SPARK_HOME/bin is now on the PATH)
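If spark-shell does not start, it can help to confirm that SPARK_HOME points at the extracted directory. The function below is a rough sketch: the name check_spark_home and its messages are assumptions made for illustration.

```shell
# Hypothetical helper: confirm that SPARK_HOME is set and points at a
# directory containing an executable bin/spark-shell.
check_spark_home() {
    if [ -z "${SPARK_HOME:-}" ]; then
        echo "SPARK_HOME is not set"
        return 1
    fi
    if [ ! -x "$SPARK_HOME/bin/spark-shell" ]; then
        echo "spark-shell not found under $SPARK_HOME/bin"
        return 1
    fi
    echo "spark-shell found at $SPARK_HOME/bin/spark-shell"
}

# Example usage (commented out; run it in a terminal opened after editing .bashrc):
# check_spark_home
```

A "not set" message usually means the .bashrc file was not reloaded, while a "not found" message suggests the path in SPARK_HOME does not match the directory where the archive was extracted.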
9. The Spark installation is now complete, and you can start writing queries