How to Run a Pig Script in HDFS Mode

Pig scripts are used to execute a set of Apache Pig commands together. This saves the time and effort of writing and running each command by hand while programming in Pig.
This is a step-by-step guide to help you create your first Apache Pig script.
An Apache Pig script works in two modes:
Local Mode: In "local mode", you can execute the Pig script on the local file system. In this case you don't need to store the data in HDFS; instead, you work with the data stored on the local file system itself.
HDFS Mode: In "HDFS mode", the data needs to be stored in HDFS, and you process it with the help of a Pig script.
Pig Script in HDFS Mode:
Step 1: Writing a script
Open an editor (e.g. gedit) in your Cloudera Demo VM environment:
Command: gedit pigsample.pig
Step 2: Create an input file with some data. Here I created a file named data.txt with some content.
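As a rough illustration, data.txt could be created from the shell as shown here. The three rows are made-up sample values, not the author's actual data; the only thing taken from this guide is the layout of four comma-separated fields (fname, lname, city, profession).

```shell
# Hypothetical sample rows in the order: fname,lname,city,profession
cat > data.txt <<'EOF'
John,Smith,Chicago,Engineer
Maria,Lopez,Austin,Teacher
Ravi,Kumar,Pune,Doctor
EOF

# Sanity check: report any line that does not have exactly 4 fields
awk -F',' 'NF != 4 { print "bad line: " $0 }' data.txt
```

If the awk check prints nothing, every line matches the four-column layout that the script expects.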
Step 3:
Load: Here I'm using the LOAD command to load the data file.
A = LOAD '/data.txt' USING PigStorage(',') AS (fname:chararray, lname:chararray, city:chararray, profession:chararray);
PigStorage(','): The load function that parses the input file by splitting each line on the given delimiter. My input file is data.txt, and in data.txt I used a comma as the delimiter.
FOREACH ... GENERATE: Used to select particular columns.
B = FOREACH A GENERATE fname, profession;
DUMP B;
DUMP is used to display the output on the screen.
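Put together, the statements from Step 3 might form a pigsample.pig like the sketch below. It is written through a shell heredoc so that it can be pasted into a terminal instead of typed into the editor. It assumes the input sits at /data.txt in HDFS and has the four columns declared in the LOAD statement; the projection uses only fname and profession because those columns are known to exist in that schema.

```shell
# Write the Pig script to pigsample.pig (quoted EOF keeps the text literal)
cat > pigsample.pig <<'EOF'
-- Load the comma-delimited file and name its four columns
A = LOAD '/data.txt' USING PigStorage(',')
    AS (fname:chararray, lname:chararray, city:chararray, profession:chararray);

-- Keep only two of the declared columns
B = FOREACH A GENERATE fname, profession;

-- Print the result to the console
DUMP B;
EOF
```

Lines starting with -- are Pig Latin comments and are ignored when the script runs.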
Step 4: Copy the file from the local file system to Hadoop (HDFS).
Step 5: Check whether the file was copied. Here, the file is copied.
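Steps 4 and 5 could be carried out with the Hadoop file system shell, roughly as sketched here. The sketch assumes that data.txt is in the current directory, that the hadoop client is on the PATH (as it normally is inside the Cloudera VM), and that the target is /data.txt, the path used in the LOAD statement. The variable names are only illustrative.

```shell
LOCAL_FILE=data.txt     # file created in Step 2 (assumed to be in the current directory)
HDFS_PATH=/data.txt     # HDFS location that the LOAD statement reads from

if command -v hadoop >/dev/null 2>&1; then
    # Step 4: copy the local file into HDFS
    hadoop fs -put "$LOCAL_FILE" "$HDFS_PATH"
    # Step 5: list the target; an entry for the file means the copy worked
    hadoop fs -ls "$HDFS_PATH"
else
    echo "hadoop command not found; run this inside the Hadoop environment" >&2
fi
```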
Step 6: Run the Pig script.
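A minimal sketch of running the script follows, assuming that pigsample.pig is in the current directory and that the pig launcher is on the PATH. The SCRIPT variable is only illustrative.

```shell
SCRIPT=pigsample.pig    # script written in Steps 1 and 3

if command -v pig >/dev/null 2>&1; then
    # -x mapreduce selects HDFS (MapReduce) mode, which is also Pig's default
    pig -x mapreduce "$SCRIPT"
else
    echo "pig command not found; run this inside the Hadoop/Pig environment" >&2
fi
```

For comparison, the local mode described earlier would use pig -x local with the same script, reading the input from the local file system instead of HDFS.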