yogesh dhari 2012-10-14, 14:54
chyi-kwei yau 2012-10-14, 15:31
yogesh dhari 2012-10-14, 17:03
chyi-kwei yau 2012-10-14, 17:24
RE: NEED HELP in Hive Query

Thanks Chyikwei :-)

I got it now :-). Is there another way to do this without using flatten(A.name) and so on?
 
A = load '/File/000000_0' using PigStorage('\u0001')
        as (name, date, url, hit:INT);

B = group A by (name, date, url);

C = foreach B generate flatten(A.name), flatten(A.date), flatten(A.url), SUM(A.hit);

D = distinct C;

Dump D;
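
One possible alternative, offered as a sketch rather than as the answer confirmed in this thread: because B is grouped on (name, date, url), the `group` field of each row in B already holds exactly one copy of those three keys. Flattening `group` instead of the bags `A.name`, `A.date`, and `A.url` should therefore give one row per key combination without the extra DISTINCT step. The `chararray` types and the alias `total_hits` below are illustrative assumptions, not part of the original query.

```pig
-- Load the Ctrl-A delimited file; the chararray types are assumed here.
A = LOAD '/File/000000_0' USING PigStorage('\u0001')
        AS (name:chararray, date:chararray, url:chararray, hit:int);

-- Each row of B has the shape (group, A): group is the (name, date, url)
-- tuple, and A is the bag of all input rows that share those keys.
B = GROUP A BY (name, date, url);

-- Flatten the key tuple (one copy per group) and sum the hits in the bag.
-- total_hits is a hypothetical alias chosen for this sketch.
C = FOREACH B GENERATE FLATTEN(group) AS (name, date, url),
                       SUM(A.hit) AS total_hits;

DUMP C;
```

Since GROUP emits one row per distinct key tuple, C has one row per (name, date, url), which is why DISTINCT is no longer needed in this variant.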

Thanks & Regards
Yogesh Kumar Dhari

> Date: Sun, 14 Oct 2012 13:24:27 -0400
> Subject: Re: NEED HELP in Hive Query
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> Hi yogesh,
>
> The result of "group by" should look like:
> {group: (group keys),  { (instance1) , (instance2)  } }
>
> For example:
>
> If A looks like:
> A: {name: chararray,age: int,gpa: float}
>
> And after  "B = GROUP A BY age;"
>
> B will become:
> B: {group: int, A: {name: chararray,age: int,gpa: float}}
>
> Then you can use
> FOREACH B GENERATE ...
> To get the result you want.
>
> If my explanation is not clear, just take a look at
> http://pig.apache.org/docs/r0.10.0/basic.html#GROUP
>
> Hope this helps.
>
> Best,
> Chyi-Kwei
>
> On Sun, Oct 14, 2012 at 1:03 PM, yogesh dhari <[EMAIL PROTECTED]> wrote:
> >
> > Hi CHyi-kwei,
> >
> > Thanks for the help. I think I did not state my question clearly.
> >
> > The query you wrote will count the number of occurrences of the same NAME, DATE and URL, but it won't add up all the hitcounts under the same name, date and url.
> >
> > I want a result like this:
> >
> > timesascent.in,    2008-08-27,    http://timesascent.in/    (/* sum of all hitcount under same name, date, url (37+17+17+27+....) */  98)
> > timesascent.in,    2008-08-27,    http://timesascent.in/section/2/Interviews    (/* sum of all hitcount under same name, date, url (15+14) */  29)
> >           .
> >           .
> >           .
> >
> > from the file below:
> >
> > NAME    DATE    URL    HITCOUNT
> > timesascent.in    2008-08-27    http://timesascent.in/index.aspx?page=tparchives    15
> > timesascent.in    2008-08-27    http://timesascent.in/index.aspx?page=article&sectid=1&contentid=200812182008121814134447219270b26    20
> > timesascent.in    2008-08-27    http://timesascent.in/    37
> > timesascent.in    2008-08-27    http://timesascent.in/section/39/Job%20Wise    14
> > timesascent.in    2008-08-27    http://timesascent.in/article/7/2011062120110621171709769aacc537/Work-environment--Employee-productivity.html    20
> > timesascent.in    2008-08-27    http://timesascent.in/    17
> > timesascent.in    2008-08-27    http://timesascent.in/section/2/Interviews    15
> > timesascent.in    2008-08-27    http://timesascent.in/    17
> > timesascent.in    2008-08-27    http://timesascent.in/    27
> > timesascent.in    2008-08-27    http://timesascent.in/    37
> > timesascent.in    2008-08-27    http://timesascent.in/    27
> > timesascent.in    2008-08-27    http://www.timesascent.in/    16
> > timesascent.in    2008-08-27    http://timesascent.in/section/2/Interviews    14
> > timesascent.in    2008-08-27    http://timesascent.in/    14
> > timesascent.in    2008-08-27    http://timesascent.in/    22
> >
> >
> > Please help and suggest how to write a query for this in Hive and Pig.
> >
> > Thanks & Regards
> > Yogesh Kumar Dhari
> >
> >> Date: Sun, 14 Oct 2012 11:31:00 -0400
> >> Subject: Re: NEED HELP in Hive Query
> >> From: [EMAIL PROTECTED]
> >> To: [EMAIL PROTECTED]
> >>
> >> Hi,
> >>
> >> In pig, you can try
> >>
> >> GROUP data BY (NAME, DATE , URL)
> >>
> >> The details are here:
> >> http://pig.apache.org/docs/r0.10.0/basic.html#GROUP
> >>
> >> Best,
> >> CHyi-kwei
> >>
> >> On Sun, Oct 14, 2012 at 10:54 AM, yogesh dhari <[EMAIL PROTECTED]> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > I have this file. I want to perform this operation in Hive and Pig.
> >> >
> >> >       NAME                  DATE               URL                                                                           HITCOUNT
     
Dmitriy Ryaboy 2012-10-18, 04:12
chyi-kwei yau 2012-10-14, 20:03