Forum

This topic contains 2 replies, has 1 voice, and was last updated by mamthakulal 2 years, 5 months ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • #1286 Reply

    mamthakulal
    Participant

    Question: Find the Maximum temperature

    Step1:

    temp = load '/data' using PigStorage('\t') as (year:int, t:int);
    dump temp;
    result:
    (2000,30)
    (2000,28)
    (2000,29)
    (2001,32)
    (2001,30)
    (2005,29)
    (2005,30)
    (2007,28)
    (2007,28)
    (2007,31)
    (2010,29)
    (2011,32)
    (2012,31)
    (2012,33)
    (2014,29)

    describe temp;
    result:
    temp: {year: int,t: int}

    Step 2:

    grpbyt = group temp by year;
    dump grpbyt;

    result:
    (2000,{(2000,30),(2000,28),(2000,29)})
    (2001,{(2001,32),(2001,30)})
    (2005,{(2005,29),(2005,30)})
    (2007,{(2007,28),(2007,28),(2007,31)})
    (2010,{(2010,29)})
    (2011,{(2011,32)})
    (2012,{(2012,31),(2012,33)})
    (2014,{(2014,29)})

    describe grpbyt;
    result:
    grpbyt: {group: int,temp: {year: int,t: int}}

    illustrate grpbyt;
    result:
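    -----------------------------------------
    | temp | year: bytearray | t: bytearray |
    -----------------------------------------
    |      | 2005            | 29           |
    |      | 2005            | 30           |
    |      | 2010            | 29           |
    -----------------------------------------
    -----------------------------
    | temp | year: int | t: int |
    -----------------------------
    |      | 2005      | 29     |
    |      | 2005      | 30     |
    |      | 2010      | 29     |
    -----------------------------
    -------------------------------------------------------
    | grpbyt | group: int | temp: bag({year: int,t: int}) |
    -------------------------------------------------------
    |        | 2005       | {(2005, 29), (2005, 30)}      |
    |        | 2010       | {(2010, 29)}                  |
    -------------------------------------------------------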

    Step 3:
    maxt = foreach grpbyt generate group, MAX(temp.t);
    dump maxt;

    result:
    (2000,30)
    (2001,32)
    (2005,30)
    (2007,31)
    (2010,29)
    (2011,32)
    (2012,33)
    (2014,29)

    Step 4:
    store maxt into '/tempresults';
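
The steps above give the maximum per year. If the question is read as asking for the single highest temperature in the whole file, a minimal sketch could look like the following. It assumes the same tab-separated '/data' input; the relation names all_grp, overall, sorted and hottest are only illustrative.

```pig
-- Load the same tab-separated (year, temperature) input used in Step 1.
temp = LOAD '/data' USING PigStorage('\t') AS (year:int, t:int);

-- GROUP ... ALL puts every record into one bag, so MAX sees the whole file.
all_grp = GROUP temp ALL;
overall = FOREACH all_grp GENERATE MAX(temp.t) AS max_t;
DUMP overall;

-- To also see which year the maximum belongs to, sort descending and keep one row.
sorted  = ORDER temp BY t DESC;
hottest = LIMIT sorted 1;
DUMP hottest;
```

With the sample data shown in Step 1, the highest reading is 33, recorded for 2012.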

    #1287 Reply

    mamthakulal
    Participant

    Q: Word Count

    Step 1:

    w = load '/words' as (wd:chararray);
    dump w;
    result:
    (Hi how are you? I am Fine)
    (Where are you? I am at Prwatech class)
    (Are you learning Hadoop? Yes I am.)

    Step 2:
    w_split = foreach w generate FLATTEN(TOKENIZE(wd)) as word;
    dump w_split;
    result:
    (Hi)
    (how)
    (are)
    (you?)
    (I)
    (am)
    (Fine)
    (Where)
    (are)
    (you?)
    (I)
    (am)
    (at)
    (Prwatech)
    (class)
    (Are)
    (you)
    (learning)
    (Hadoop?)
    (Yes)
    (I)
    (am.)

    describe w_split;
    w_split: {word: chararray}

    illustrate w_split;
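    ------------------------------------------
    | w | wd: bytearray                      |
    ------------------------------------------
    |   | Are you learning Hadoop? Yes I am. |
    ------------------------------------------
    ------------------------------------------
    | w | wd: chararray                      |
    ------------------------------------------
    |   | Are you learning Hadoop? Yes I am. |
    ------------------------------------------
    -----------------------------
    | w_split | word: chararray |
    -----------------------------
    |         | Are             |
    |         | you             |
    |         | learning        |
    |         | Hadoop?         |
    |         | Yes             |
    |         | I               |
    |         | am.             |
    -----------------------------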

    Step 3:

    wrdgrp = group w_split by word;
    dump wrdgrp;
    result:

    (I,{(I),(I),(I)})
    (Hi,{(Hi)})
    (am,{(am),(am)})
    (at,{(at)})
    (Are,{(Are)})
    (Yes,{(Yes)})
    (am.,{(am.)})
    (are,{(are),(are)})
    (how,{(how)})
    (you,{(you)})
    (Fine,{(Fine)})
    (you?,{(you?),(you?)})
    (Where,{(Where)})
    (class,{(class)})
    (Hadoop?,{(Hadoop?)})
    (Prwatech,{(Prwatech)})
    (learning,{(learning)})

    describe wrdgrp;
    wrdgrp: {group: chararray,w_split: {word: chararray}}

    illustrate wrdgrp;
    ---------------------------------------------
    | w | wd: chararray                         |
    ---------------------------------------------
    |   | Hi how are you? I am Fine             |
    |   | Where are you? I am at Prwatech class |
    |   | Are you learning Hadoop? Yes I am.    |
    ---------------------------------------------
    -----------------------------
    | w_split | word: chararray |
    -----------------------------
    |         | Hi              |
    |         | how             |
    |         | are             |
    |         | you?            |
    |         | I               |
    |         | am              |
    |         | Fine            |
    |         | Where           |
    |         | are             |
    |         | you?            |
    |         | I               |
    |         | am              |
    |         | at              |
    |         | Prwatech        |
    |         | class           |
    |         | Are             |
    |         | you             |
    |         | learning        |
    |         | Hadoop?         |
    |         | Yes             |
    |         | I               |
    |         | am.             |
    -----------------------------
    ---------------------------------------------------------------
    | wrdgrp | group: chararray | w_split: bag({word: chararray}) |
    ---------------------------------------------------------------
    |        | Are              | {(Are)}                         |
    |        | Fine             | {(Fine)}                        |
    |        | Hadoop?          | {(Hadoop?)}                     |
    |        | Hi               | {(Hi)}                          |
    |        | I                | {(I), (I), (I)}                 |
    |        | Prwatech         | {(Prwatech)}                    |
    |        | Where            | {(Where)}                       |
    |        | Yes              | {(Yes)}                         |
    |        | am               | {(am), (am)}                    |
    |        | am.              | {(am.)}                         |
    |        | are              | {(are), (are)}                  |
    |        | at               | {(at)}                          |
    |        | class            | {(class)}                       |
    |        | how              | {(how)}                         |
    |        | learning         | {(learning)}                    |
    |        | you              | {(you)}                         |
    |        | you?             | {(you?), (you?)}                |
    ---------------------------------------------------------------

    Step 4:

    wrdcount = foreach wrdgrp generate group, COUNT(w_split);

    dump wrdcount;
    result:

    (I,3)
    (Hi,1)
    (am,2)
    (at,1)
    (Are,1)
    (Yes,1)
    (am.,1)
    (are,2)
    (how,1)
    (you,1)
    (Fine,1)
    (you?,2)
    (Where,1)
    (class,1)
    (Hadoop?,1)
    (Prwatech,1)
    (learning,1)

    Step 5:
    store wrdcount into '/wordcount_pig';
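
In the result above, "you" and "you?", "am" and "am.", and "are" and "Are" are counted as different words, because TOKENIZE does not split on "?" or "." and grouping is case-sensitive. If those should be merged, one possible sketch is to normalise each token before grouping. The relation names w_clean and w_nonempty and the regular expression are assumptions for illustration, not part of the original assignment.

```pig
-- Load one line of text per record, as in Step 1.
w = LOAD '/words' AS (wd:chararray);

-- Split each line into tokens, one token per record.
w_split = FOREACH w GENERATE FLATTEN(TOKENIZE(wd)) AS word;

-- Strip every non-alphanumeric character and lower-case the token,
-- so that "you?" and "you", or "Are" and "are", fall into the same group.
w_clean = FOREACH w_split GENERATE LOWER(REPLACE(word, '[^A-Za-z0-9]', '')) AS word;

-- Drop tokens that consisted only of punctuation.
w_nonempty = FILTER w_clean BY word != '';

wrdgrp   = GROUP w_nonempty BY word;
wrdcount = FOREACH wrdgrp GENERATE group AS word, COUNT(w_nonempty) AS cnt;
DUMP wrdcount;
```

For the three sample lines, this should count "are", "you", "am" and "i" three times each and every other word once.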

    #1289 Reply

    mamthakulal
    Participant

    Q: Word Size

    Step 1:

    w = load '/words' as (wd:chararray);
    dump w;
    (Hi how are you? I am Fine)
    (Where are you? I am at Prwatech class)
    (Are you learning Hadoop? Yes I am.)

    Step 2:

    wrdgrp = group w by SIZE(wd);
    dump wrdgrp;

    result:
    (25,{(Hi how are you? I am Fine)})
    (34,{(Are you learning Hadoop? Yes I am.)})
    (37,{(Where are you? I am at Prwatech class)})

    Step 3:
    wrdsizec = foreach wrdgrp generate group, COUNT(w);
    dump wrdsizec;
    result:
    (25,1)
    (34,1)
    (37,1)

    Step 4:
    store wrdsizec into '/Wordsizecount_pig';
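
The steps above group whole lines by their character length, which is why every group holds a single line. If "Word Size" is meant as the number of words of each length, a hedged sketch would tokenize first and then group by the size of each word. The relation names sizegrp and sizecount are illustrative only.

```pig
-- Load one line of text per record, as in Step 1.
w = LOAD '/words' AS (wd:chararray);

-- Split each line into individual words.
w_split = FOREACH w GENERATE FLATTEN(TOKENIZE(wd)) AS word;

-- Group by the length of each word rather than the length of each line.
sizegrp = GROUP w_split BY SIZE(word);

-- Count how many words fall into each length bucket.
sizecount = FOREACH sizegrp GENERATE group AS word_len, COUNT(w_split) AS cnt;
DUMP sizecount;
```

Punctuation stays attached to the tokens here, so "you?" counts as a four-character word. On the sample input that gives 3 words of length 1, 4 of length 2, 7 of length 3, 3 of length 4, 2 of length 5, 1 of length 7 and 2 of length 8.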
