Re: Applying schemas after flatten?
Hi Dave,

Try:

C = FOREACH B GENERATE (t.y, FLATTEN(t.CUSTS) AS (anothery:chararray, custbag:bag));
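
FLATTEN on a bag of one-field tuples yields a single field per output row, so a schema attached to it names just that one field. A minimal sketch of the "one row per customer" chain, assuming a Pig release that ships the TOBAG built-in; the names custs and cust are made up for illustration and do not come from the original script:

```pig
-- Sketch only: custs and cust are assumed names; the LOAD line is from the question.
A = LOAD 'data' USING PigStorage('\u0001') AS (y:chararray, cust1:int, cust2:int);

-- TOBAG wraps each customer field in its own one-field tuple and collects them in a bag.
B = FOREACH A GENERATE y, TOBAG(cust1, cust2) AS custs;

-- FLATTEN emits one row per tuple in the bag; AS names the single flattened field.
C = FOREACH B GENERATE y, FLATTEN(custs) AS (cust:int);

-- Print the resulting schema to check that the AS clause was accepted.
DESCRIBE C;
```

Keeping y as a top-level field, rather than nesting it in a tuple next to the bag, means only the flattened column needs an AS clause.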

On Sat, Sep 8, 2012 at 8:41 AM, David Lapsley <[EMAIL PROTECTED]> wrote:

> Hi Folks:
>
> I am new to the Pig world. I have been using it for about a week and I am
> completely blown away by how good it is.
>
> I have a question about schemas. I have a processing chain similar to the
> following:
>
> A = LOAD 'data' USING PigStorage('\u0001') AS (y:chararray, cust1:int,
> cust2:int);
> B = FOREACH A GENERATE (y, {(cust1), (cust2)}) AS t: tuple(y, CUSTS);
> C = FOREACH B GENERATE(t.y, FLATTEN(t.CUSTS));
>
> Basically, my raw data contains multiple customer records per row, along
> with some common data. I would like to "explode" each row, so that I have
> one row per customer record (each also including the common data).
>
> The code above does this; however, I am not able to supply a schema for C.
> Whenever I try, I get an error about mismatched schemas.
>
> I would greatly appreciate any pointers you may have.
>
> Best regards,
>
> Dave.
>
>