Pig >> mail # user >> Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
Hi Ruslan,

Thanks for your help. I really appreciate it. It really puzzled me.

I did a "describe B", the output is "B: {b: bytearray}".

I then tried to cast it as suggested and got:
B = foreach A generate document#'b' as b:{};
describe B;
B: {b: {()}}

Then I proceeded with:
C = foreach B generate flatten(b);

I got:
2013-04-17 13:38:04,601 [Thread-16] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
java.lang.Exception: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:586)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:250)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:680)

Best Regards,

Jerry
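
Editor's sketch (not part of the original message): the trace above suggests that the "as b:{}" clause changed only the declared schema of b, while the runtime value stayed a DataByteArray. One thing that could be tried is an explicit cast operator instead of a schema annotation. The relation name B_cast is hypothetical, A and document are assumed to be defined as earlier in the thread, and the element type map[] is a guess based on the dump output quoted below. Whether the cast succeeds depends on the Pig version and on the loader's caster, so this is untested.

```
-- Assumes relation A (with the map field 'document') is already loaded
-- as in the earlier messages; its LOAD statement is not shown here.
B = foreach A generate document#'b' as b;

-- Explicit cast from bytearray to a bag of single-field tuples holding maps.
-- B_cast is a hypothetical name; map[] is an assumed element type.
B_cast = foreach B generate (bag{tuple(map[])})b as b;
describe B_cast;

-- FLATTEN needs a real bag at runtime, not just a bag in the schema.
C = foreach B_cast generate flatten(b);
dump C;
```

If the describe output shows a bag but the same ClassCastException still appears, the cast itself is not being applied at runtime and a different approach would be needed.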
On Wed, Apr 17, 2013 at 1:24 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:

> Hey, and as for converting a map of tuples, I probably got you wrong. If
> you can get to every value manually within FOREACH then I see no problem
> in doing so.
>
>
> On Wed, Apr 17, 2013 at 9:22 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
>
> > I am not sure whether you can convert a map to a tuple.
> > But I am curious about one thing:
> > you are trying to use 'b' as a Bag, right? Because FLATTEN needs it to be
> > a Bag, I guess:
> > http://pig.apache.org/docs/r0.10.0/basic.html#flatten
> > But it seems that Pig thinks that b is a byte array:
> > java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
> > cast to org.apache.pig.data.DataBag
> > Can you do this?:
> > DESCRIBE B
> >
> > I suppose it can look like a Bag in the output of DUMP, but I think Pig
> > doesn't know it is a Bag, maybe you'll need some kind of explicit cast?
> >
> >
> > On Wed, Apr 17, 2013 at 9:11 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> >
> >> Hi Ruslan,
> >>
> >> I tried to debug each step already but no luck.
> >> I did a dump (dump B;) after B = foreach A generate document#'b' as b;
> >> I got {([c#11,d#22]),([c#33,d#44])}
> >> but it fails when I did C = foreach B generate flatten(b);
> >>
> >> I don't have control over the input. It is passed as a Map of Maps. I guess
> >> it makes lookup easier using a map with keys.
> >>
> >> Can I convert map to tuple?
> >>
> >> Best Regards,
> >>
> >> Jerry
> >>
> >>
> >>
> >> On Wed, Apr 17, 2013 at 11:57 AM, Ruslan Al-Fakikh <