Re: NEED HELP in Hive Query
B = group A by (name, date, url);
-- B now has 2 fields: "group", which is a tuple of (name, date, url),
-- and "A", which is a bag of the tuples from A that share the same
-- name-date-url
-- try "illustrate B" or "describe B" to see what that looks like

-- flatten(group) turns the key tuple back into three plain columns;
-- swap COUNT_STAR(A) for SUM(A.hit) if you want total hits, not a row count
counts = foreach B generate flatten(group) as (name, date, url),
COUNT_STAR(A) as num_entries;
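
In case it helps to see what group plus an aggregate computes, here is a rough sketch in Python (not Pig) using a few rows from the data quoted below. The names `rows` and `group_and_sum` are made up for illustration. It mirrors grouping by (name, date, url) and then emitting the flattened key with the summed hit column, which in the Pig script would correspond to using SUM(A.hit) in place of COUNT_STAR(A).

```python
from collections import defaultdict

# A few (name, date, url, hit) rows copied from the data quoted in this thread.
rows = [
    ("timesascent.in", "2008-08-27", "http://timesascent.in/", 37),
    ("timesascent.in", "2008-08-27", "http://timesascent.in/section/2/Interviews", 15),
    ("timesascent.in", "2008-08-27", "http://timesascent.in/", 17),
    ("timesascent.in", "2008-08-27", "http://timesascent.in/section/2/Interviews", 14),
]


def group_and_sum(rows):
    """Group rows by (name, date, url) and total the hit column per group."""
    totals = defaultdict(int)
    for name, date, url, hit in rows:
        # The key plays the role of Pig's "group" tuple.
        totals[(name, date, url)] += hit
    # Emit one flat row per group: the key fields followed by the total,
    # similar in spirit to "flatten(group), SUM(A.hit)".
    return [key + (total,) for key, total in totals.items()]


result = group_and_sum(rows)
for row in sorted(result):
    print(row)
```

Because each group yields exactly one output row, no "distinct" step is needed afterwards.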

Dmitriy

On Sun, Oct 14, 2012 at 10:57 AM, yogesh dhari <[EMAIL PROTECTED]> wrote:
>
> Thanks Chyikwei :-)
>
> I got it now :-). Is there another method that doesn't use flatten(A.name) and so on?
>
> A = load '/File/000000_0' using PigStorage('\u0001')
>        as (name, date, url, hit:INT);
>
> B = group A by ( name, date, url);
>
> C = foreach B generate flatten(A.name), flatten(A.date), flatten(A.url), SUM(A.hit) ;
>
> D = distinct C;
>
> Dump D;
>
> Thanks & Regards
> Yogesh Kumar Dhari
>
>> Date: Sun, 14 Oct 2012 13:24:27 -0400
>> Subject: Re: NEED HELP in Hive Query
>> From: [EMAIL PROTECTED]
>> To: [EMAIL PROTECTED]
>>
>> Hi yogesh,
>>
>> The result of "group by" should look like:
>> {group: (group keys),  { (instance1) , (instance2)  } }
>>
>> For example:
>>
>> If A looks like:
>> A: {name: chararray,age: int,gpa: float}
>>
>> And after  "B = GROUP A BY age;"
>>
>> B will become:
>> B: {group: int, A: {name: chararray,age: int,gpa: float}}
>>
>> Then you can use
>> FOREACH B Generate.....
>> To get the result you want.
>>
>> If my explanation is not clear, just take a look at
>> http://pig.apache.org/docs/r0.10.0/basic.html#GROUP
>>
>> Hope this helps.
>>
>> Best,
>> Chyi-Kwei
>>
>> On Sun, Oct 14, 2012 at 1:03 PM, yogesh dhari <[EMAIL PROTECTED]> wrote:
>> >
>> > Hi Chyi-Kwei,
>> >
>> > Thanks for the help. I think I wasn't able to make my question clear.
>> >
>> > The query you wrote will count the number of occurrences of the same NAME, DATE and URL, but it won't add up all the hitcounts under the same name, date, url.
>> >
>> > I want a result like this:
>> >
>> > timesascent.in,    2008-08-27,    http://timesascent.in/    (/* sum of all hitcounts under the same name, date, url (37+17+17+27+....) */  98)
>> > timesascent.in,    2008-08-27,    http://timesascent.in/section/2/Interviews    (/* sum of all hitcounts under the same name, date, url (15+14) */  29)
>> >           .
>> >           .
>> >           .
>> >
>> > From this file below
>> >
>> > NAME              DATE          URL    HITCOUNT
>> > timesascent.in    2008-08-27    http://timesascent.in/index.aspx?page=tparchives    15
>> > timesascent.in    2008-08-27    http://timesascent.in/index.aspx?page=article&sectid=1&contentid=200812182008121814134447219270b26    20
>> > timesascent.in    2008-08-27    http://timesascent.in/    37
>> > timesascent.in    2008-08-27    http://timesascent.in/section/39/Job%20Wise    14
>> > timesascent.in    2008-08-27    http://timesascent.in/article/7/2011062120110621171709769aacc537/Work-environment--Employee-productivity.html    20
>> > timesascent.in    2008-08-27    http://timesascent.in/    17
>> > timesascent.in    2008-08-27    http://timesascent.in/section/2/Interviews    15
>> > timesascent.in    2008-08-27    http://timesascent.in/    17
>> > timesascent.in    2008-08-27    http://timesascent.in/    27
>> > timesascent.in    2008-08-27    http://timesascent.in/    37
>> > timesascent.in    2008-08-27    http://timesascent.in/    27
>> > timesascent.in    2008-08-27    http://www.timesascent.in/    16
>> > timesascent.in    2008-08-27    http://timesascent.in/section/2/Interviews    14
>> > timesascent.in    2008-08-27    http://timesascent.in/    14
>> > timesascent.in    2008-08-27    http://timesascent.in/    22
>> >
>> >
>> > Please help and suggest how to write a query for this in Hive and Pig.
>> >
>> > Thanks & Regards
>> > Yogesh Kumar Dhari
>> >
>> >> Date: Sun, 14 Oct 2012 11:31:00 -0400