Re: Generating multiple tuples from single tuple
You can probably hack together something that does exactly this without
writing a UDF, but I think a UDF will be the most useful approach here,
especially if you later want to handle more columns.
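
For what it's worth, here is a rough, untested sketch of the no-UDF route. It
assumes the input file is named 'data', that each row has exactly two
Name:Value columns after the ID, and that there is no whitespace around the
'|' delimiter; the aliases and field names (id, c1, c2, pair, name, value)
are made up for illustration. It relies only on the TOBAG and STRSPLIT
built-ins:

```pig
-- Load each row as an ID plus two "Name:Value" strings (schema is assumed).
A = LOAD 'data' USING PigStorage('|')
    AS (id:chararray, c1:chararray, c2:chararray);

-- Put the two Name:Value strings into a bag and flatten it,
-- so each input row becomes two rows: (id, pair).
B = FOREACH A GENERATE id, FLATTEN(TOBAG(c1, c2)) AS (pair:chararray);

-- Split each pair on the first ':' and flatten the resulting tuple
-- into separate name and value fields.
C = FOREACH B GENERATE id,
    FLATTEN(STRSPLIT(pair, ':', 2)) AS (name:chararray, value:chararray);

DUMP C;
```

The intent is for C to hold one (id, name, value) tuple per column. Every
extra column has to be listed by hand in the TOBAG call, which is where a UDF
starts to look more attractive.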

2012/7/1 Subir S <[EMAIL PROTECTED]>

> Would FLATTEN help?
>
> B = GROUP A by ID;
>
> C = FOREACH B GENERATE group, FLATTEN ($1);
>
> Might work, I guess. Not tested.
>
> On Mon, Jul 2, 2012 at 8:04 AM, naresh <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I am new to Pig scripting. I would like to generate multiple tuples from
> > a single tuple. What I mean is:
> >
> > I have a file with the following data in it.
> >
> > >> cat data
> >
> > ID | ColumnName1:Value1 | ColumnName2:Value2
> >
> > So I load it with the following command:
> >
> > grunt >> A = load '$data' using PigStorage('|');
> >
> > grunt >> dump A;
> >
> > (ID,ColumnName1:Value1,ColumnName2:Value2)
> >
> > Now I want to split this tuple into two tuples.
> >
> > (ID, ColumnName1, Value1)
> > (ID, ColumnName2, Value2)
> >
> > Can I use a UDF along with foreach and generate? Something like the
> > following?
> >
> > grunt >> foreach A generate SOMEUDF(A)
> >
> > Thanks for your time,
> > Naresh.
> >
>