Pig >> mail # user >> "duplicate uid in schema" feature or bug?


Peter Gieser 2012-04-10, 06:26
Jonathan Coveney 2012-04-10, 06:42
Re: "duplicate uid in schema" feature or bug?
Not sure if this will work in your use case, but adding a FLATTEN to strip
the outer tuple before the FOREACH statements seems to steer Pig around the
bug:

B = FOREACH A GENERATE FLATTEN(a);
B1 = FOREACH B GENERATE x, y;
B2 = FOREACH B GENERATE x, y;

Norbert

On Tue, Apr 10, 2012 at 2:42 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> This is indeed a bug (and a pretty nasty, if infrequent, one). Thank you
> for filing the JIRA!
>
> 2012/4/9 Peter Gieser <[EMAIL PROTECTED]>
>
> > I have created a bug (https://issues.apache.org/jira/browse/PIG-2636)
> > based on the following (simplified) script:
> >
> > A = LOAD 'bug.in' AS a:tuple(x:int, y:int);
> > B1 = FOREACH A GENERATE a.x, a.y;
> > B2 = FOREACH A GENERATE a.x, a.y;
> > C = JOIN B1 BY x, B2 BY x;
> >
> > that yields the following error:
> >
> > org.apache.pig.impl.plan.PlanValidationException: ERROR 2270: Logical plan
> > invalid state: duplicate uid in schema :
> > B1::x#35:int,B1::y#36:int,B2::x#35:int,B2::y#36:int
> >
> >
> > I assumed this was a bug, but perhaps Pig is not meant to support this?
> > Is there an easy way to achieve the result if it turns out to be
> > unsupported?
> >
> > Thanks,
> > Pete
>