Pig >> mail # user >> should the following query work?


Re: should the following query work?

Dare I ask why such a query would be used? AFAICT the second group
operation would just stick each record in a bag and create an extra
copy of group on the outside of the bag (but use up a lot more
computational power than a UDF that would just do the same thing
explicitly).
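
To make that concrete, here is a rough Python model of the two group steps. It is only a sketch: the `group_by` helper and the sample rows are made up for illustration, and it imitates the shape of Pig's GROUP output rather than how Pig actually executes it.

```python
from collections import defaultdict

def group_by(records, key):
    """Imitate Pig's GROUP: return (group, bag) pairs, one per distinct key."""
    bags = defaultdict(list)
    for rec in records:
        bags[key(rec)].append(rec)
    # Keys are unique, so sorting only ever compares the keys.
    return sorted(bags.items())

# Hypothetical contents of foo.txt as (x, y) tuples.
A = [("a", 1), ("a", 2), ("b", 3)]

# B = group A by x;      ->  (group, A-bag)
B = group_by(A, key=lambda rec: rec[0])

# C = group B by group;  ->  (group, B-bag)
C = group_by(B, key=lambda rec: rec[0])

# Every key in B is already unique, so each bag in C holds exactly one
# record of B, with a second copy of the key on the outside.
for group, bag in C:
    print(group, len(bag), bag)

# What E in the quoted query is asking for, B.(group, A.(x)):
# keep the inner group and project only x out of the innermost bag.
E = [[(g, [(x,) for (x, _y) in a]) for (g, a) in b] for (_group, b) in C]
print(E)
```

Since B already has one record per key, the second pass has nothing left to collect; all it does is wrap each record of B in a one-element bag.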

Cheers,
Kris

On Thu, Dec 09, 2010 at 03:34:58PM -0800, Lin Guo wrote:
> A = load 'foo.txt' using PigStorage as (x : chararray, y : int);
>
> B = group A by x;
> C = group B by group;
> describe C;
>
> -- we got
> -- C: {group: chararray,B: {group: chararray,A: {x: chararray,y: int}}}
>
> D = foreach C generate B.(group, A);  -- this works
> describe D;
>
> E = foreach C generate B.(group, A.(x));
> describe E;
> --- pig returns syntax error, but should this work? Or is there a patch for it?
>
> thanks,
> lin

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3