Pig, mail # user - FLATTEN() behavior difference in 0.8.1 and 0.10.0 ?


Re: FLATTEN() behavior difference in 0.8.1 and 0.10.0 ?
Yang 2012-06-25, 13:45
Thanks Norbert, I'll try it.
On Jun 25, 2012 3:56 AM, "Norbert Burger" <[EMAIL PROTECTED]> wrote:

> Yang -- I think you'll get the representation you're looking for by
> applying the FLATTEN a second time.  Each instance of a FLATTEN strips off
> a single layer.
>
> Norbert
>
> On Sun, Jun 24, 2012 at 5:57 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>
> > generate K.(x1), K.(x2), K.(x3) .... , K.(x100); and generate
> > K.(x1,...,x100); are actually very different.
> >
> > The latter is a bag, with columns x1, x2..x100. This is generally what is
> > desired.
> >
> > The former is a bag of column x1, then a bag of column x2, then a bag of
> > column x3, etc. Each will be unordered and independent.
> >
> > 2012/6/24 yonghu <[EMAIL PROTECTED]>
> >
> > > You can also write it like
> > >
> > > K.(x1,x2,...,x100).
> > >
> > > regards!
> > >
> > > Yong
> > >
> > > On Sun, Jun 24, 2012 at 8:40 PM, Yang <[EMAIL PROTECTED]> wrote:
> > > > thanks,
> > > >
> > > > but this is a bit more cumbersome: if I have
> > > >
> > > > generate K.(x1), K.(x2), K.(x3) .... , K.(x100);
> > > >
> > > > I'd have to re-write each xn by adding K.( )
> > > >
> > > >
> > > > it would be nice if the schema of K can strip off the surrounding {( )}.
> > > > actually it should, since this is after a FLATTEN()
> > > >
> > > >
> > > > Yang
> > > >
> > > > On Sun, Jun 24, 2012 at 11:17 AM, yonghu <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> So, I think you want to project the x in K. You can write the Pig statement as:
> > > >>
> > > >> M = foreach K generate K.(x) as X;
> > > >>
> > > >> Hope this can help you.
> > > >>
> > > >> Yong
> > > >>
> > > >> On Sun, Jun 24, 2012 at 12:40 PM, Yang <[EMAIL PROTECTED]> wrote:
> > > >> > my UDF returns a bag of tuples : mybag:bag{ mytuple: tuple ( x:int, y:int)}
> > > >> >
> > > >> > in my pig script:
> > > >> >
> > > >> > I do
> > > >> >
> > > >> > K = foreach blah generate UDF( xxx);
> > > >> >
> > > >> > M = foreach K generate x;
> > > >> >
> > > >> >
> > > >> > here Pig 0.8.1 says x cannot be found in the schema, since
> > > >> >
> > > >> > describe K
> > > >> >
> > > >> > shows:
> > > >> > { mytuple:tuple(x:int , y:int) }
> > > >> >
> > > >> > while 0.10.0
> > > >> >
> > > >> > shows
> > > >> > {x:int, y:int}
> > > >>
> > >
> >
>
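
The two points made in the thread (each FLATTEN strips a single layer, and K.(x1), K.(x2) differs from K.(x1, x2)) can be sketched with a toy model. This is plain Python, not Pig and not how Pig is implemented: a bag is modelled as a list, a tuple as a dict of named fields, and the names strip_layer, project and udf_results are made up for illustration.

```python
from itertools import chain


def strip_layer(nested):
    """Remove exactly one level of nesting (toy stand-in for one FLATTEN)."""
    return list(chain.from_iterable(nested))


def project(bag, *fields):
    """Toy stand-in for K.(f1, f2, ...): one bag whose tuples keep only the named fields."""
    return [tuple(row[f] for f in fields) for row in bag]


# Hypothetical UDF output: one bag of (x, y) tuples per input record.
udf_results = [
    [{"x": 1, "y": 10}, {"x": 2, "y": 20}],
    [{"x": 3, "y": 30}],
]

# One call removes one layer: the per-record bags become a single list of tuples.
rows = strip_layer(udf_results)

# If the data is wrapped one level deeper, a single call is not enough;
# a second call is needed to reach the same rows.
wrapped = [udf_results]
rows_again = strip_layer(strip_layer(wrapped))

# K.(x, y): one bag, and each tuple keeps x and y side by side.
together = project(rows, "x", "y")

# K.(x), K.(y): two independent bags; the pairing of x with y is no longer
# carried by the data structure itself.
separate = [project(rows, "x"), project(rows, "y")]
```

In this model the x and y values in "together" stay paired row by row, while "separate" holds two unrelated single-column bags, which is the distinction Jonathan describes.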