Re: Unable to load data using PigStorage that was previously stored using PigStorage
I think that before doing the FLATTEN, you should be 100% sure that your
cast worked properly. Can you first DESCRIBE B and then DUMP B right away?
Or perhaps it simply can't be cast this way. Honestly, I don't know
exactly how it works, but in the cast table here:
http://pig.apache.org/docs/r0.10.0/basic.html#cast
I see that casting from a map to a bag should produce an error.
Hope that helps.
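
A minimal sketch of the check suggested above, plus one thing that may be worth trying: an explicit cast operator instead of a type in the AS clause. The file name, the LOAD schema, and the inner tuple type (chararray) are assumptions for illustration, since the thread does not show them. Whether the conversion succeeds depends on how the stored text parses, so treat this as something to test, not a confirmed fix.

```pig
-- Hypothetical input path and schema; the original LOAD statement is not shown in the thread.
A = LOAD 'data.txt' USING PigStorage() AS (document:map[]);

-- An explicit cast asks Pig to convert the bytearray value itself.
-- In the thread, "as b:{}" changed the schema reported by DESCRIBE,
-- but the value was still a DataByteArray when FLATTEN ran.
-- The inner type chararray is an assumption about the stored data.
B = FOREACH A GENERATE (bag{tuple(chararray)})document#'b' AS b;

-- Check both the declared schema and the actual values before flattening.
DESCRIBE B;
DUMP B;

-- Only flatten once B looks like a real bag in both outputs.
C = FOREACH B GENERATE FLATTEN(b);
DUMP C;
```

If DUMP B shows nulls for b, the cast could not parse the stored value as a bag, and the data may need to be stored or loaded with a fuller schema instead.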
On Wed, Apr 17, 2013 at 9:38 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:

> Hi Ruslan:
>
> Thanks for your help. I really appreciate it. It really puzzled me.
>
> I did a "describe B", the output is "B: {b: bytearray}".
>
> I then tried to cast it as suggested and got:
> B = foreach A generate document#'b' as b:{};
> describe B;
> B: {b: {()}}
>
> Then I proceeded with:
> C = foreach B generate flatten(b);
>
> I got:
> 2013-04-17 13:38:04,601 [Thread-16] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
> java.lang.Exception: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:586)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:250)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:680)
>
> Best Regards,
>
> Jerry
>
>
> On Wed, Apr 17, 2013 at 1:24 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
>
> > Hey, and as for converting a map of tuples, I probably got you wrong. If
> > you can get to every value manually within FOREACH, then I see no problem
> > in doing so.
> >
> >
> > On Wed, Apr 17, 2013 at 9:22 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
> >
> > > I am not sure whether you can convert a map to a tuple.
> > > But I am curious about one thing:
> > > you are trying to use 'b' as a Bag, right? Because FLATTEN needs it to be
> > > a Bag, I guess:
> > > http://pig.apache.org/docs/r0.10.0/basic.html#flatten
> > > But it seems that Pig thinks that b is a byte array:
> > > java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
> > > cast to org.apache.pig.data.DataBag
> > > Can you do this?:
> > > DESCRIBE B
> > >
> > > I suppose it can look like a Bag in the output of DUMP, but I think Pig
> > > doesn't know it is a Bag, maybe you'll need some kind of explicit cast?