#3194

MD SAJID AKHTAR
Participant

How to run PIG command from Hue?

First, open the web UI and click the Hue button, then log in with the Cloudera credentials. In the “Query Editor” menu, choose Pig Editor. There we can write, save, and run commands from the editor.

Limitations of Pig

1> Pig cannot deal well with poorly designed XML or JSON and flexible schemas.
2> It has a problem dealing with unstructured data like images, videos, audio, etc.
3> Pig is built on top of MapReduce, which is batch-oriented.

How to achieve Performance Tuning in Pig?

i> Use Optimization:
Pig supports various optimization rules which are turned on by default.

ii> Use Types:
If types are not specified in the load statement, Pig assumes the type of double for numeric computations. When the data is really int or long, declaring the actual type helps the speed of arithmetic.
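
As a rough illustration of the point about types, the sketch below loads the same input twice, once without and once with declared types. The file name 'myfile' and the field names t, u, v are placeholders, not taken from any real data set.

```pig
-- No types declared: t and u are untyped, so t + u is computed as double.
A = LOAD 'myfile' AS (t, u, v);
B = FOREACH A GENERATE t + u;

-- Types declared: t and u are int, so t + u is integer arithmetic.
A_typed = LOAD 'myfile' AS (t: int, u: int, v);
B_typed = FOREACH A_typed GENERATE t + u;
```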

iii> Reduce Your Operator Pipeline:
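We can split our projections into several steps for the clarity of our script, but each extra step adds an operator to the pipeline. Combining consecutive steps (for example, two FOREACH statements) can improve performance.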
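
A minimal sketch of this trade-off, assuming a hypothetical input 'data' with a single map field m that holds keys 'k1' and 'k2':

```pig
A = LOAD 'data' AS (m: map[]);

-- Readable version: two FOREACH steps.
B = FOREACH A GENERATE (chararray) m#'k1' AS k1, (chararray) m#'k2' AS k2;
C = FOREACH B GENERATE CONCAT(k1, k2);

-- Shorter pipeline: the same result in a single FOREACH.
C_short = FOREACH A GENERATE CONCAT((chararray) m#'k1', (chararray) m#'k2');
```

Both forms should give the same output; the second simply has one operator fewer.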

iv> Filter Early and Often:
In most cases it is beneficial to apply filters as early as possible, to reduce the amount of data flowing through the pipeline.
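
For example, a filter placed before a join means fewer records reach the join. The sketch below assumes two hypothetical inputs, 'users' and 'clicks', with made-up field names. Pig's optimizer may move some filters up on its own, so the gain from doing it by hand can vary.

```pig
users  = LOAD 'users'  AS (name: chararray, age: int);
clicks = LOAD 'clicks' AS (name: chararray, url: chararray);

-- Filter first, so only the matching users take part in the join.
adults = FILTER users BY age >= 18;
joined = JOIN adults BY name, clicks BY name;
```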

Piggy Bank and its application

PIGGY BANK:-
-> User-defined Pig functions.
-> A place for Pig users to share their functions.
-> Users can edit a function and contribute it to the Piggy Bank.
-> Piggy Bank is Pig’s repository of user-contributed functions. They are distributed as part of the Pig distribution.
-> We need to register the Piggy Bank jar to use it. The jar can be found at contrib/piggybank/java/piggybank.jar
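
A short sketch of registering the jar and calling one Piggy Bank function. It assumes the script is run from the Pig installation directory, so that the relative jar path resolves, and that the jar has already been built. The input 'names' and the alias ToUpper are placeholders.

```pig
-- Make the Piggy Bank functions available to this script.
REGISTER contrib/piggybank/java/piggybank.jar;

-- Give the fully qualified function a short alias.
DEFINE ToUpper org.apache.pig.piggybank.evaluation.string.UPPER();

A = LOAD 'names' AS (name: chararray);
B = FOREACH A GENERATE ToUpper(name);
```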

Prwatech