Re: "duplicate uid in schema" feature or bug?
Norbert Burger 2012-04-10, 14:53
Not sure if this will work for your use case, but adding a FLATTEN to strip
the outer tuple before the FOREACHs seems to steer Pig around the bug:
B = FOREACH A GENERATE FLATTEN(a);
B1 = FOREACH B GENERATE x, y;
B2 = FOREACH B GENERATE x, y;
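Put together with Pete's original script (quoted below), the whole thing
would look roughly like this. Treat it as an untested sketch: it assumes the
same 'bug.in' input and schema, and that the final JOIN then goes through
without the duplicate-uid error.

```pig
-- Sketch only: the original script with the FLATTEN step added.
A = LOAD 'bug.in' AS a:tuple(x:int, y:int);
-- Strip the outer tuple once, so x and y become top-level fields of B.
B = FOREACH A GENERATE FLATTEN(a);
B1 = FOREACH B GENERATE x, y;
B2 = FOREACH B GENERATE x, y;
-- Same self-join as in the original script.
C = JOIN B1 BY x, B2 BY x;
```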
On Tue, Apr 10, 2012 at 2:42 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> This is indeed a bug (and a pretty nasty, if infrequent, one). Thank you
> for filing the JIRA!
> 2012/4/9 Peter Gieser <[EMAIL PROTECTED]>
> > I have created a bug (https://issues.apache.org/jira/browse/PIG-2636)
> > based on the following (simplified) script:
> > A = LOAD 'bug.in' AS a:tuple(x:int, y:int);
> > B1 = FOREACH A GENERATE a.x, a.y;
> > B2 = FOREACH A GENERATE a.x, a.y;
> > C = JOIN B1 BY x, B2 BY x;
> > that yields the following error:
> > org.apache.pig.impl.plan.PlanValidationException: ERROR 2270: Logical
> > invalid state: duplicate uid in schema :
> > B1::x#35:int,B1::y#36:int,B2::x#35:int,B2::y#36:int
> > I assumed this was a bug, but perhaps Pig is not meant to support this?
> > Is there an easy way to achieve the result if it turns out to be
> > unsupported?
> > Thanks,
> > Pete