Pig >> mail # user >> FLATTEN() behavior difference in 0.8.1 and 0.10.0 ?


Re: FLATTEN() behavior difference in 0.8.1 and 0.10.0 ?
Actually, FLATTEN(FLATTEN(...)) is not syntactically valid, at least in
0.8. Semantically it's not what I wanted either, because FLATTEN works on
bags, while I wanted to project ALL fields of a tuple.

I ended up adding a T:tuple( ) to the AS clause, and adding an explicit
projection after the UDF call.
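
Roughly, a minimal sketch of what that change could look like. The names
blah, UDF, xxx, x and y come from the original question quoted below; the
FLATTEN wrapper and the exact shape of the AS clause are assumptions, since
the real script was not posted, and this targets the 0.8.1 schema only:

```pig
-- Assumption: the UDF call is wrapped in FLATTEN, as the subject line suggests.
-- On 0.8.1 the flattened element is described as a nested tuple, so it is
-- named here as a single tuple field T with an explicit schema.
K = FOREACH blah GENERATE FLATTEN(UDF(xxx)) AS (T:tuple(x:int, y:int));

-- Explicit projection: dereference the tuple to get top-level fields.
M = FOREACH K GENERATE T.x AS x, T.y AS y;
```

With a schema like this, describe M would be expected to list x and y as
top-level int fields, which is the shape 0.10.0 reports directly.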

Thanks
Yang

On Mon, Jun 25, 2012 at 6:45 AM, Yang <[EMAIL PROTECTED]> wrote:

> thanks Robert, I'll try it
> On Jun 25, 2012 3:56 AM, "Norbert Burger" <[EMAIL PROTECTED]>
> wrote:
>
>> Yang -- I think you'll get the representation you're looking for by
>> applying the FLATTEN a second time.  Each instance of a FLATTEN strips off
>> a single layer.
>>
>> Norbert
>>
>> On Sun, Jun 24, 2012 at 5:57 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>>
>> > generate K.(x1), K.(x2), K.(x3) .... , K.(x100); and generate
>> > K.(x1,...,x100) are actually very different.
>> >
>> > The latter is a bag, with columns x1, x2..x100. This is generally what is
>> > desired.
>> >
>> > The former is a bag of column x1, then a bag of column x2, then a bag of
>> > column x3, etc. Each will be unordered and independent.
>> >
>> > 2012/6/24 yonghu <[EMAIL PROTECTED]>
>> >
>> > > You can also write like
>> > >
>> > > K1.(x1,x2,...,x100).
>> > >
>> > > regards!
>> > >
>> > > Yong
>> > >
>> > > On Sun, Jun 24, 2012 at 8:40 PM, Yang <[EMAIL PROTECTED]> wrote:
>> > > > thanks,
>> > > >
>> > > > but this is a bit more cumbersome: if I have
>> > > >
>> > > > generate K.(x1), K.(x2), K.(x3) .... , K.(x100);
>> > > >
>> > > > I'd have to re-write each xn by adding K.( )
>> > > >
>> > > >
>> > > > it would be nice if the schema of K can strip off the surrounding {( )}.
>> > > > actually it should, since this is after a FLATTEN()
>> > > >
>> > > >
>> > > > Yang
>> > > >
>> > > > On Sun, Jun 24, 2012 at 11:17 AM, yonghu <[EMAIL PROTECTED]> wrote:
>> > > >
>> > > >> So, I think you want to project the x in K. You can write the pig as:
>> > > >>
>> > > >> M = foreach K generate K.(x) as X;
>> > > >>
>> > > >> Hope this can help you.
>> > > >>
>> > > >> Yong
>> > > >>
>> > > >> On Sun, Jun 24, 2012 at 12:40 PM, Yang <[EMAIL PROTECTED]> wrote:
>> > > >> > my UDF returns a bag of tuples : mybag:bag{ mytuple: tuple(x:int, y:int) }
>> > > >> >
>> > > >> > in my pig script:
>> > > >> >
>> > > >> > I do
>> > > >> >
>> > > >> > K = foreach blah generate UDF( xxx);
>> > > >> >
>> > > >> > M = foreach K generate x;
>> > > >> >
>> > > >> >
>> > > >> > here PIG 0.8.1 says x can not be found in schema, since
>> > > >> >
>> > > >> > describe K
>> > > >> >
>> > > >> > shows:
>> > > >> > { mytuple:tuple(x:int , y:int) }
>> > > >> >
>> > > >> > while 0.10.0
>> > > >> >
>> > > >> > shows
>> > > >> > {x:int, y:int}
>> > > >>
>> > >
>> >
>>
>