Sqoop – MySql to HDFS in Cloudera VM

  • date 19th March, 2021 |
  • by Prwatech |


 

Sqoop is a tool for transferring data between Apache Hadoop and relational databases such as MySQL, in both directions (import and export). This tutorial uses it to import a MySQL table into the Hadoop Distributed File System (HDFS) inside a Cloudera VM (virtual machine).

To perform this task, first make sure that Sqoop is installed and configured in the Cloudera VM. Then pass the necessary parameters to the Sqoop command: the JDBC connection string, the database credentials, the table name, and the target HDFS directory. Sqoop generates a MapReduce job whose map tasks copy the rows in parallel; the -m option sets the number of map tasks.
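As a rough illustration of how these parameters fit together, the sketch below assembles a sqoop import command from shell variables and prints it instead of running it. The host, database, table, and directory values are placeholders borrowed from the steps in this post, and the function name build_sqoop_import is hypothetical. It uses Sqoop's -P option, which prompts for the password, as an alternative to writing the password on the command line.

```shell
#!/bin/sh
# Placeholder values: replace them with those of your own VM and database.
DB_HOST="192.168.43.68"   # IP address of the machine running MySQL
DB_NAME="prwatech"        # MySQL database to read from
DB_USER="root"            # MySQL user
TABLE="Employee"          # table to import
TARGET_DIR="myfirstdata"  # HDFS directory to write to (must not exist yet)
MAPPERS=1                 # number of parallel map tasks

# Hypothetical helper: prints the import command without executing it.
build_sqoop_import() {
  printf 'sqoop import --connect jdbc:mysql://%s/%s --username %s -P --table %s --target-dir %s -m %s\n' \
    "$DB_HOST" "$DB_NAME" "$DB_USER" "$TABLE" "$TARGET_DIR" "$MAPPERS"
}

build_sqoop_import
```

Printing the command first makes it easy to check the connection string before starting a MapReduce job. Once it looks right, the printed line can be pasted into the terminal.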

 

Prerequisites

Hardware requirements:

Local machine

RAM: 8 GB or above

Software requirements:

VMware Workstation: https://prwatech.in/blog/software-installation/vmware-workstation-installation/

Cloudera: https://www.cloudera.com/downloads/cdh.html

Required skills: basic Linux command line

Learn Linux: https://prwatech.in/blog/linux/linux-architecture/

STEPS

1. Open Cloudera in VMware and open a terminal.

 

 

2. Log in to MySQL using the following command: mysql -u root -p

3. List the existing databases: show databases;

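4. Create a database of your choice with the command: create database databasename; (the Sqoop command in step 8 assumes the name prwatech). Select it with: use databasename; and then create a table with the following command: create table Employee(firstName varchar(50), LastName varchar(50));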

5. Insert a few rows: insert into Employee values("firstname", "lastname"), ("firstname", "lastname");

6. To display the contents of the table, type: select * from Employee;

7. Exit MySQL and check the IP address of the Cloudera VM using the command: ifconfig

8. Type the following Sqoop command, replacing the IP address and database name with your own: sqoop import --connect jdbc:mysql://192.168.43.68/prwatech --username root --password cloudera --table Employee --target-dir myfirstdata -m 1

9. Confirm that the target directory was created in your HDFS home directory: hadoop fs -ls

10. List the files inside the target directory: hadoop fs -ls myfirstdata

11. Display the imported data: hadoop fs -cat myfirstdata/part-m-00000
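The MySQL part of the steps above (steps 4 to 6) can also be kept in a SQL file so that the setup is repeatable. The sketch below writes such a file from the shell. The file name employee_setup.sql is a placeholder, the database name prwatech is assumed from the connection string in step 8, and the row values are the same placeholders used in step 5.

```shell
#!/bin/sh
# Write the SQL for steps 4 to 6 into a file (file name is a placeholder).
cat > employee_setup.sql <<'EOF'
-- Create and select the database named in the Sqoop connection string.
CREATE DATABASE IF NOT EXISTS prwatech;
USE prwatech;

-- Same table definition as in step 4.
CREATE TABLE IF NOT EXISTS Employee (
  firstName VARCHAR(50),
  LastName  VARCHAR(50)
);

-- Placeholder rows; running the file again inserts them a second time.
INSERT INTO Employee VALUES ('firstname', 'lastname'), ('firstname', 'lastname');

-- Show what will be imported.
SELECT * FROM Employee;
EOF

# Print how to run the file; -p makes the mysql client prompt for the password.
echo "Run: mysql -u root -p < employee_setup.sql"
```

Sqoop writes text imports as comma-separated lines by default, so after the import each inserted row should appear in myfirstdata/part-m-00000 as one line of the form firstname,lastname.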

 
