Pig, mail # user - FLATTEN(bag_of_tuples) error in 0.8.1 ?


Re: FLATTEN(bag_of_tuples) error in 0.8.1 ?
Yang 2012-07-19, 00:48
We use CDH3u3.

Unfortunately, given our company's ops experience, we have to stick with
CDH3u3 and Pig 0.8.1.

On Wed, Jul 18, 2012 at 5:39 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> Pig 0.8.1 isn't really seeing any active development at all. Is there a
> reason why you can't use 0.10.0?
>
> 2012/7/18 Yang <[EMAIL PROTECTED]>
>
> > This actually caused a rather nasty bug today.
> >
> > In another UDF that returns a bag of tuples, I originally wrapped the
> > tuple in a FieldSchema inside the bag, and expected the schema for
> > FLATTEN(myudf()) to be
> >
> > mytuple::field1, mytuple::field2,
> >
> > but the values of all the fields were actually expanded into the root
> > level, and one of them overwrote another field that had the same name
> > but without the "mytuple::" part.
> >
> > This is on 0.8.1.
> >
> >
> >
> >
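The collision described above can be sketched with a toy model in plain Java. This is not Pig's implementation: the class `FlattenAliasSketch`, its methods, the sample values, and the "last writer wins" binding rule are all assumptions made only to illustrate how losing the "mytuple::" prefix lets a flattened field shadow an existing one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, no Pig classes involved.
final class FlattenAliasSketch {

    // Aliases of a row made of the existing fields followed by the flattened
    // tuple fields. keepPrefix=true models "mytuple::field1" style names;
    // keepPrefix=false models fields landing at the root level with bare names.
    static List<String> flattenedAliases(List<String> existing, String tupleAlias,
                                         List<String> tupleFields, boolean keepPrefix) {
        List<String> out = new ArrayList<>(existing);
        for (String field : tupleFields) {
            out.add(keepPrefix ? tupleAlias + "::" + field : field);
        }
        return out;
    }

    // Binds values to aliases by position. ASSUMPTION: a later duplicate alias
    // replaces the earlier value, which mimics the overwrite seen in the thread.
    static Map<String, String> bindByAlias(List<String> aliases, List<String> values) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < aliases.size(); i++) {
            row.put(aliases.get(i), values.get(i));
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> existing = Arrays.asList("field1");
        List<String> tupleFields = Arrays.asList("field1", "field2");
        List<String> values = Arrays.asList("original", "fromUdf1", "fromUdf2");

        // Prefixed names stay distinct from the existing field.
        System.out.println(bindByAlias(
            flattenedAliases(existing, "mytuple", tupleFields, true), values));
        // Bare names collide: the existing field1 is replaced.
        System.out.println(bindByAlias(
            flattenedAliases(existing, "mytuple", tupleFields, false), values));
    }
}
```

With the prefix kept, the row holds three distinct entries; without it, the two `field1` aliases collapse into one and the UDF's value wins.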
> > On Tue, Jul 17, 2012 at 11:25 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> >
> > > In 0.10 you have to have bag -> tuple -> elements.
> > >
> > > 2012/7/17 Yang <[EMAIL PROTECTED]>
> > >
> > > > OK, found the issue.
> > > >
> > > > I no longer create an explicit FieldSchema for the inner tuple schema;
> > > > instead I insert the tuple schema directly into the bag. Then it works.
> > > >
> > > > This is indeed a difference between 0.8.1 and 0.10, because the
> > > > original version works on 0.10 and the new one only works on 0.8.1.
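The two schema shapes being compared can be pictured with a small stand-alone sketch. The `Node` class below is a hypothetical stand-in for a schema field, not Pig's `Schema` or `FieldSchema`; the aliases `mybag`, `mytuple`, `field1`, `field2` and the `chararray` types are assumed for illustration. Per the thread, the wrapped shape is the one that worked on 0.10 and the direct shape is the one that worked on 0.8.1.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: Node is a hypothetical stand-in, not a Pig class.
final class BagSchemaShapes {

    // A schema node: an alias, a type name, and zero or more child fields.
    static final class Node {
        final String alias;
        final String type;
        final List<Node> children;

        Node(String alias, String type, Node... children) {
            this.alias = alias;
            this.type = type;
            this.children = Arrays.asList(children);
        }

        // Renders the node as alias:type{child,child,...}.
        String describe() {
            if (children.isEmpty()) {
                return alias + ":" + type;
            }
            StringBuilder sb = new StringBuilder(alias + ":" + type + "{");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(",");
                }
                sb.append(children.get(i).describe());
            }
            return sb.append("}").toString();
        }
    }

    // bag -> tuple -> elements: the tuple gets its own explicit field in the bag.
    static Node wrappedShape() {
        return new Node("mybag", "bag",
            new Node("mytuple", "tuple",
                new Node("field1", "chararray"),
                new Node("field2", "chararray")));
    }

    // bag -> elements: the tuple's fields are placed directly in the bag.
    static Node directShape() {
        return new Node("mybag", "bag",
            new Node("field1", "chararray"),
            new Node("field2", "chararray"));
    }

    public static void main(String[] args) {
        System.out.println(wrappedShape().describe());
        System.out.println(directShape().describe());
    }
}
```

The only difference between the two is the intermediate `mytuple` level, which is exactly the level the two Pig versions are reported to treat differently.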
> > > >
> > > > On Tue, Jul 17, 2012 at 4:59 PM, Yang <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > I created a UDF that returns a bag of tuples. The syntax is all fine,
> > > > > but when I run it, Pig gives this error:
> > > > > 12/07/17 16:51:58 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> > > > > 12/07/17 16:51:58 WARN mapred.LocalJobRunner: job_local_0001
> > > > > java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:392)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:342)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:290)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> > > > >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > > >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > > > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > > > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > > > > 12/07/17 16:51:58 INFO mapReduceLayer.MapReduceLauncher: HadoopJobId: job_local_0001
> > > > >
> > > > >
> > > > >
> > > > > It looks like the returned value is wrong somehow, but I checked the
> > > > > outputSchema() method and it is exactly the same as in the online
> > > > > docs. Where am I going wrong?
> > > > >
> > > > > This is Pig 0.8.1. About a month ago I posted a question stating that
> > > > > the FLATTEN(bag_of_tuples) behavior in 0.8.1 differs from 0.10.0:
> > > > > 0.8.1 keeps the enclosing tuple, while 0.10.0 strips it and places
> > > > > the fields at the root level.
> > > > >
> > > > >
> > > > >
> > > > > Thanks!
> > > > > yang
> > > > >
> > > > > ///// DemoUdf.java
> > > > >
> > > > > import java.io.IOException;
> >