Pig >> mail # user >> Applying schemas after flatten?


David Lapsley 2012-09-07, 22:41
Re: Applying schemas after flatten?
Hi Dave,

Try:

C = FOREACH B GENERATE (t.y, FLATTEN(t.CUSTS) AS (anothery:chararray, custbag:bag));

On Sat, Sep 8, 2012 at 8:41 AM, David Lapsley <[EMAIL PROTECTED]> wrote:

> Hi Folks:
>
> I am new to the Pig world. I have been using it for about a week and I am
> completely blown away by how good it is.
>
> I have a question about schemas. I have a processing chain similar to the
> following:
>
> A = LOAD 'data' USING PigStorage('\u0001') AS (y:chararray, cust1:int, cust2:int);
> B = FOREACH A GENERATE (y, {(cust1), (cust2)}) AS t: tuple(y, CUSTS);
> C = FOREACH B GENERATE (t.y, FLATTEN(t.CUSTS));
>
> So, basically, my raw data contains multiple customer records per row, plus
> some common data. I would like to "explode" each row so that I have one
> row per customer record (each of which also includes the common data).
>
> The code above does this; however, I am not able to supply a schema for C.
> Whenever I try to do so, I get an error regarding mismatched schemas.
>
> I would greatly appreciate any pointers you may have.
>
> Best regards,
>
> Dave.
>
>
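
For reference, here is a minimal sketch of one way the "explode" step quoted above might be written so that C carries a declared schema. It is an untested illustration, not a confirmed fix for the error in this thread. It assumes the built-in TOBAG function is available in your Pig version, and the aliases custs and cust are hypothetical names chosen for this example. The point it shows is that flattening a bag of one-field tuples produces one field per output row, so the AS clause after FLATTEN lists exactly one field.

```pig
-- Sketch only: 'data' and the '\u0001' delimiter come from the quoted script;
-- the aliases custs and cust are hypothetical names used for illustration.
A = LOAD 'data' USING PigStorage('\u0001') AS (y:chararray, cust1:int, cust2:int);

-- Build one bag per row, holding a one-field tuple for each customer column.
B = FOREACH A GENERATE y, TOBAG(cust1, cust2) AS custs;

-- FLATTEN emits one output row per tuple in the bag, so the AS clause
-- names one field for each field of the flattened tuple (here, just one).
C = FOREACH B GENERATE y, FLATTEN(custs) AS (cust:int);

-- Print the schema of C to check that the declared names and types were applied.
DESCRIBE C;
```

Keeping y and the bag as separate top-level fields, rather than wrapping them in a tuple first, means FLATTEN is applied directly in the GENERATE list, which is where a schema can be attached to its output.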