Pig, mail # user - "duplicate uid in schema" feature or bug?


Peter Gieser 2012-04-10, 06:26
Jonathan Coveney 2012-04-10, 06:42
Re: "duplicate uid in schema" feature or bug?
Norbert Burger 2012-04-10, 14:53
Not sure if this will work in your use case, but adding a FLATTEN to strip
the outer tuple before the FOREACHs seems to be enough to sidestep the
bug:

B = FOREACH A GENERATE FLATTEN(a);
B1 = FOREACH B GENERATE x, y;
B2 = FOREACH B GENERATE x, y;
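
For completeness, the whole script with this workaround applied to Pete's
example might look like the following. This is an untested sketch: it
assumes the fields keep the names x and y after the FLATTEN, and it writes
the LOAD schema in the parenthesized form.

```pig
-- Load each record as a single tuple field, as in the original report.
A = LOAD 'bug.in' AS (a:tuple(x:int, y:int));
-- Strip the outer tuple once so x and y become top-level fields.
B = FOREACH A GENERATE FLATTEN(a);
B1 = FOREACH B GENERATE x, y;
B2 = FOREACH B GENERATE x, y;
-- The self-join that triggered ERROR 2270 in the original script.
C = JOIN B1 BY x, B2 BY x;
```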

Norbert

On Tue, Apr 10, 2012 at 2:42 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> This is indeed a bug (and a pretty nasty, if infrequent, one). Thank you
> for filing the JIRA!
>
> 2012/4/9 Peter Gieser <[EMAIL PROTECTED]>
>
> > I have created a bug (https://issues.apache.org/jira/browse/PIG-2636)
> > based on the following (simplified) script:
> >
> > A = LOAD 'bug.in' AS a:tuple(x:int, y:int);
> > B1 = FOREACH A GENERATE a.x, a.y;
> > B2 = FOREACH A GENERATE a.x, a.y;
> > C = JOIN B1 BY x, B2 BY x;
> >
> > that yields the following error:
> >
> > org.apache.pig.impl.plan.PlanValidationException: ERROR 2270: Logical plan
> > invalid state: duplicate uid in schema :
> > B1::x#35:int,B1::y#36:int,B2::x#35:int,B2::y#36:int
> >
> >
> > I assumed this was a bug, but perhaps Pig is not meant to support this?
> > Is there an easy way to achieve the result if it turns out to be
> > unsupported?
> >
> > Thanks,
> > Pete
>