Pig >> mail # user >> Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
The output:
({ ([c#11,d#22]),([c#33,d#44]) })
()
looks weird.

Jerry, maybe the problem is in using PigStorage. As its javadoc says:

A load function that parses a line of input into fields using a character
delimiter

So I guess it is meant only for simple delimited lines such as CSV.
But you are trying to load a complex map structure in the text format that
the previous store produced.
You will probably need to write your own loader for this. Another hint: try
the -schema parameter of PigStorage, although I am not sure it will help :(
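
To make the two hints concrete, here is a minimal, untested sketch in Pig
Latin. The paths 'out/b_schema' and 'out/b' and the aliases src, B2 and B3
are made up for illustration, and the declared schema assumes the stored
lines look like the bag of maps in the dump above. Whether PigStorage's text
parsing copes with this nesting may depend on the Pig version, so treat it
as something to try rather than a confirmed fix.

```pig
-- Hint 1: store with '-schema' so PigStorage writes a hidden .pig_schema
-- file next to the data, then request that schema again on load.
-- 'src' stands for whatever relation is being stored (hypothetical alias);
-- 'out/b_schema' is a hypothetical path.
STORE src INTO 'out/b_schema' USING PigStorage('\t', '-schema');
B2 = LOAD 'out/b_schema' USING PigStorage('\t', '-schema');
DESCRIBE B2;

-- Hint 2: declare the complex type in the AS clause of LOAD instead of
-- casting a bytearray afterwards. 'out/b' is a hypothetical path to data
-- stored earlier; the schema assumes one bag of single-map tuples per line.
B3 = LOAD 'out/b' USING PigStorage('\t') AS (b: bag{t: tuple(m: map[])});
DESCRIBE B3;
C = FOREACH B3 GENERATE FLATTEN(b);
DUMP C;
```

If DESCRIBE reports b as a bag rather than a bytearray, FLATTEN should no
longer run into the DataByteArray cast, but I have not verified this against
your data.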

Ruslan
On Wed, Apr 17, 2013 at 11:48 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:

> Hi Ruslan:
>
> I ran DESCRIBE B followed by DUMP B; the output is:
> B: {b: {()}}
>
> ({ ([c#11,d#22]),([c#33,d#44]) })
> ()
>
> but when I executed
>
> C = foreach B generate flatten(b);
>
> dump C;
>
> I got the exception again...
>
> 2013-04-17 15:47:39,933 [Thread-26] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
> java.lang.Exception: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:586)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:250)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:680)
>
>
> Best Regards,
>
> Jerry
>
>
> On Wed, Apr 17, 2013 at 3:26 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
>
> > I think that before doing the FLATTEN, you should be 100% sure that your
> > cast worked properly. Can you first DESCRIBE B and then DUMP B right away?
> > Or probably it just can't be cast in this way. Honestly I don't know
> > exactly how it works, but here:
> > http://pig.apache.org/docs/r0.10.0/basic.html#cast
> > I see that casting from a map to a bag should produce an error.
> > Hope that helps.
> >
> >
> > On Wed, Apr 17, 2013 at 9:38 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Ruslan:
> > >
> > > Thanks for your help. I really appreciate it. It really puzzled me.
> > >
> > > I did a "describe B", the output is "B: {b: bytearray}".
> > >
> > > I then tried the cast as suggested and got:
> > > B = foreach A generate document#'b' as b:{};
> > > describe B;
> > > B: {b: {()}}
> > >
> > > Then I proceeded with:
> > > C = foreach B generate flatten(b);