Forum

This topic contains 18 replies, has 7 voices, and was last updated by  arpita21 4 months, 2 weeks ago.

Viewing 15 posts - 1 through 15 (of 19 total)
  • #3163 Reply

    Sanchita Sen
    Participant

    Limitations of Pig
    1. Low-latency queries are not supported in Pig.
    2. Pig does not support random reads or writes.
    3. Pig works well only for batch processing.

    #3173 Reply

    Debarati chatterjee

    Run a Pig command from Hue
    Ans: Open the Hue web UI and log in with the Cloudera credentials. Under the “Query Editor” menu, choose the Pig editor. From there we can write, run, and save commands.

    #3174 Reply

    apuaparna
    Participant

    What are the limitations of Pig?
    Ans. 1. When something goes wrong, Pig just reports an execution error in the UDF. It does not show the type of error, such as a syntax error, a type error, or a logical error. A developer should at least be told when the problem is a syntax error.
    2. Commands are not executed until you DUMP or STORE an intermediate or final result. This lengthens each cycle of debugging and resolving an issue.
    3. Low-latency queries are not supported in Pig, so it is not suitable for OLTP or OLAP.
    4. If we want to do random writes to update a small portion of the data, we cannot use Pig.

    #3175 Reply

    Debarati chatterjee

    When we run Pig in local mode, will it convert the query into MR or not?
    Ans:
    Yes. Pig still compiles the script into MapReduce jobs; in local mode they run in a single JVM on the local machine and read from the local file system instead of HDFS.

    #3176 Reply

    sayanti
    Participant

    Limitations of Pig:

    1) Pig code is relatively less efficient than hand-written MR code.
    2) Pig is built on top of MapReduce, which is batch oriented.
    3) It works well for batch processing.
    4) Low-latency queries are not supported in Pig.

    #3178 Reply

    Debarati chatterjee

    3) How does the physical translator work when a Pig query is compiled?
    Ans:
    After the logical plan is generated, compilation moves on to the physical plan, which describes the physical operators Apache Pig will use to execute the script. A physical plan is more or less like a series of MapReduce jobs, but it carries no reference to how it will be executed in MapReduce. While the physical plan is created, the COGROUP logical operator is converted into three physical operators: Local Rearrange, Global Rearrange, and Package. Load and store functions are usually resolved in the physical plan.

    #3179 Reply

    Sanchita Sen
    Participant

    How to achieve performance tuning in Pig?
    Performance tuning in Pig can be achieved with the following:
    1. Use Pig's optimization rules.
    2. Use types. If types are not specified in the load statement, Pig assumes the type double for numeric computations.
    3. Project early and often: drop a field from the row as soon as it is no longer needed.
    4. Use filters early to reduce the amount of data flowing through the pipeline.
    5. Reduce the operator pipeline.
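
    To make points 3 and 4 concrete, here is a minimal Python sketch (not Pig Latin, and not how Pig works internally). It counts adults per country twice: once carrying every field of every row into the grouping step, and once filtering and projecting first. The sample rows, field layout, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (user, country, age, notes). Not taken from the thread.
ROWS = [
    ("ann", "IN", 31, "x" * 50),
    ("bob", "US", 17, "y" * 50),
    ("cat", "IN", 45, "z" * 50),
    ("dan", "IN", 12, "w" * 50),
]


def late_filter(rows):
    """Group full rows first, then discard minors inside each group."""
    shipped = 0                      # field values carried into the group step
    groups = defaultdict(list)
    for row in rows:
        shipped += len(row)          # every field of every row is carried along
        groups[row[1]].append(row)
    result = {}
    for country, members in groups.items():
        adults = sum(1 for m in members if m[2] >= 18)
        if adults:
            result[country] = adults
    return result, shipped


def early_filter(rows):
    """Filter and project before grouping, so only the country is carried."""
    shipped = 0
    counts = defaultdict(int)
    for _user, country, age, _notes in rows:
        if age < 18:
            continue                 # filter early: minors never reach the group
        shipped += 1                 # project early: one field per surviving row
        counts[country] += 1
    return dict(counts), shipped


if __name__ == "__main__":
    print(late_filter(ROWS))
    print(early_filter(ROWS))
```

    Both functions return the same counts; only the number of field values carried into the grouping step differs.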

    #3180 Reply

    arpita21
    Participant

    When we run Pig in local mode, will it convert the query into MR or not?

    Yes. In local mode Pig takes its data from the local file system (LFS) and needs no Hadoop cluster or HDFS, but the script is still converted into MapReduce jobs, which run in a single JVM on the local machine.

    #3181 Reply

    arpita21
    Participant

    How does the physical translator work when a Pig query is compiled?

    Pig goes through several steps when a Pig Latin script is converted into MapReduce jobs. After basic parsing and semantic checking, it produces a logical plan, which describes the logical operators Pig has to execute. Pig then produces a physical plan, which describes the physical operators needed to execute the script.

    #3182 Reply

    Debarati chatterjee

    Compilation Stage
    * Optimized Logical Plan?
    * Physical Plan?
    Ans:
    Logical and physical plans are created during the execution of a Pig script. Pig scripts are checked by an interpreter. The logical plan is produced after basic parsing and semantic checking, and no data processing takes place while it is created. For each line in the Pig script, a syntax check is performed on the operators and a logical plan is created. If an error is encountered in the script, an exception is thrown and execution ends; otherwise each statement in the script gets its own logical plan. A logical plan contains the collection of operators in the script but does not contain the edges between the operators.
    After the logical plan is generated, compilation moves on to the physical plan, which describes the physical operators Apache Pig will use to execute the script. A physical plan is more or less like a series of MapReduce jobs, but it carries no reference to how it will be executed in MapReduce. While the physical plan is created, the COGROUP logical operator is converted into three physical operators: Local Rearrange, Global Rearrange, and Package. Load and store functions are usually resolved in the physical plan.
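
    The COGROUP translation described above can be sketched in Python. This only illustrates the three roles (tag each record with its key, bring records with the same key together, collect them into one bag per input). It is not Pig's actual operator code, and the function names and sample data are hypothetical.

```python
from itertools import groupby


def local_rearrange(records, key_index, source):
    """Tag every record with its key and the index of the input it came from."""
    return [(rec[key_index], source, rec) for rec in records]


def global_rearrange(*tagged_inputs):
    """Merge all tagged inputs and bring records with equal keys together."""
    merged = [t for tagged in tagged_inputs for t in tagged]
    merged.sort(key=lambda t: (t[0], t[1]))   # order by key, then by input index
    return groupby(merged, key=lambda t: t[0])


def package(key, tagged, n_inputs):
    """Build one output tuple: the key plus one bag (list) per input."""
    bags = [[] for _ in range(n_inputs)]
    for _key, source, rec in tagged:
        bags[source].append(rec)
    return (key, *bags)


def cogroup(a, a_key, b, b_key):
    """Cogroup two inputs by chaining the three steps above."""
    grouped = global_rearrange(
        local_rearrange(a, a_key, 0),
        local_rearrange(b, b_key, 1),
    )
    return [package(key, group, 2) for key, group in grouped]


if __name__ == "__main__":
    owners = [("alice", "cat"), ("bob", "dog")]
    visits = [("alice", 3), ("carol", 1)]
    for row in cogroup(owners, 0, visits, 0):
        print(row)
```

    A key that appears in only one input still produces an output tuple, with an empty bag for the other input.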

    #3184 Reply

    Debarati chatterjee

    How to achieve performance tuning in Pig?
    Ans:
    a. Use Optimization
    Pig supports various optimization rules, which are turned on by default. Become familiar with these rules.
    b. Use Types
    If types are not specified in the load statement, Pig assumes the type double for numeric computations. Often your data is much smaller, perhaps integer or long. Specifying the real type speeds up arithmetic computation and has the additional advantage of early error detection.
    c. Project Early and Often
    Pig does not (yet) determine when a field is no longer needed and drop the field from the row. For example, say you have a query like:
    d. Filter Early and Often
    As with early projection, in most cases it is beneficial to apply filters as early as possible to reduce the amount of data flowing through the pipeline.
    e. Reduce Your Operator Pipeline
    For clarity of your script, you might choose to split your projects into several steps for instance:
    f. Make Your UDFs Algebraic
    Queries that can take advantage of the combiner generally run much faster (sometimes several times faster) than versions that don’t. The latest code significantly improves combiner usage; however, you need to make sure you do your part. If you have a UDF that works on grouped data and is, by nature, algebraic (meaning its computation can be decomposed into multiple steps), make sure you implement it as such. For details on how to write algebraic UDFs, see Algebraic Interface.
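
    As an illustration of what "algebraic" means in point f, here is a small Python sketch of an average split into an initial, an intermediate, and a final step. It is not a Pig UDF and does not use Pig's Java interface; the function names and the idea of passing the data as pre-split chunks are assumptions made for the example.

```python
def initial(value):
    """Turn one input value into a partial result: (sum, count)."""
    return (value, 1)


def intermediate(partials):
    """Combine partial results into one partial result (combiner-style step)."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return (total, count)


def final(partials):
    """Combine the remaining partials and produce the average.

    Assumes at least one value was seen, so count is not zero.
    """
    total, count = intermediate(partials)
    return total / count


def algebraic_avg(chunks):
    """Average over data that arrives in separate chunks (one per map task)."""
    # Each chunk is reduced locally first, so only one (sum, count) pair
    # per chunk has to be sent on to the final step.
    per_chunk = [intermediate([initial(v) for v in chunk]) for chunk in chunks]
    return final(per_chunk)


if __name__ == "__main__":
    print(algebraic_avg([[1, 2, 3], [4, 5]]))
```

    The point of the decomposition is that each chunk shrinks to a single (sum, count) pair before anything is combined, which is the kind of work a combiner can do.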

    #3185 Reply

    Debarati chatterjee

    How to implement a map-side join or a reduce-side join in Pig?
    Ans:
    Map-side join: In a map-side (fragment-replicate) join, you hold one dataset in memory and join it against the other dataset record by record. In Pig Latin this is requested by adding USING 'replicated' to the JOIN statement, in which the large relation is listed first, followed by one or more small relations. The small relations must be small enough to fit into main memory; if they don’t, the process fails and an error is generated.
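
    The idea behind the fragment-replicate join can be sketched in a few lines of Python. This is a simplified illustration, not Pig's implementation: the small relation is loaded into an in-memory lookup table and the large relation is streamed past it one record at a time. The function name and sample records are hypothetical.

```python
from collections import defaultdict


def replicated_join(large, small, large_key, small_key):
    """Inner-join two lists of tuples by holding the small one in memory."""
    # Build the in-memory lookup from the small relation. This is the part
    # that must fit into main memory.
    lookup = defaultdict(list)
    for rec in small:
        lookup[rec[small_key]].append(rec)

    # Stream the large relation record by record; no shuffle or sort needed.
    for rec in large:
        for match in lookup.get(rec[large_key], ()):
            yield rec + match


if __name__ == "__main__":
    events = [(1, "click"), (2, "view"), (3, "click")]   # large relation
    users = [(1, "ann"), (2, "bob")]                     # small relation
    for row in replicated_join(events, users, 0, 0):
        print(row)
```

    Because the large relation is only iterated once and never sorted or regrouped, the whole join can happen on the map side.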

    #3186 Reply

    Sanchita Sen
    Participant

    Piggy Bank and its application
    Piggy Bank is a place for Pig users to share the Java UDFs they have written. The functions are provided “as-is”. If anyone finds a bug in a function, they should take the time to fix it and contribute the fix to Piggy Bank.

    #3187 Reply

    apuaparna
    Participant

    Run a Pig command from Hue.
    Ans. First open the Hue web UI and log in with the Cloudera credentials. Under the “Query Editor” menu, choose the Pig editor. There we can write commands and run them from the editor. We can also save the commands and their results.

    #3188 Reply

    sayanti
    Participant

    When we run Pig in local mode, will it convert the query into MR or not?

    -> Yes. Pig still compiles the script into MapReduce jobs; in local mode they run in a single JVM on the local machine rather than on a cluster.
