Pig, mail # user - Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
Prashant Kommireddi 2013-04-18, 16:34
Well, let me rephrase: the values all have to be the same type if you
choose to read all columns in a similar way. If you know in advance that
it's always the value associated with key 'b' that's a bag, why don't you
cast that single value?

B = foreach A generate (bag{})document#'b' as b;
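
For context, here is a minimal sketch of how that cast might fit into a full script. The input path 'input' and the load schema (document:map[]) are assumptions, not taken from the thread, and the script is untested; whether the cast succeeds may depend on the Pig version and on how the data was written.

```pig
-- 'input' is a hypothetical path; the schema (document:map[]) is assumed.
-- With an untyped map, every value is loaded as a bytearray.
A = load 'input' using PigStorage() as (document:map[]);

-- Cast only the value stored under key 'b' to a bag, as suggested above.
B = foreach A generate (bag{})document#'b' as b;

-- Flatten the bag once it is an actual bag rather than a bytearray.
C = foreach B generate flatten(b);
dump C;
```

The idea is that declaring a bag in the schema is not enough on its own; the value has to be converted from bytearray before flatten can treat it as a bag.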

On Thu, Apr 18, 2013 at 7:43 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:

> Hi Prashant:
>
> I read about the map data type in the book "Programming Pig", it says:
> "... By default there is no requirement that all values in a map must be of
> the same type. It is legitimate to have a map with two keys name and age,
> where the value for name is a chararray and the value for age is an int.
> Beginning in Pig 0.9, a map can declare its values to all be of the same
> type... "
>
> I agree that all values in the map can be of the same type, but this is not
> required in Pig.
>
> Best Regards,
>
> Jerry
>
>
> On Thu, Apr 18, 2013 at 10:37 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
>
> > Hi Ruslan:
> >
> > I used PigStorage to store data that originally used Pig data types.
> > It is strange (or a bug in Pig) that I cannot read data using
> > PigStorage that was stored using PigStorage, isn't it?
> >
> > Best Regards,
> >
> > Jerry
> >
> >
> >
> > On Wed, Apr 17, 2013 at 10:52 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
> >
> >> The output:
> >> ({ ([c#11,d#22]),([c#33,d#44]) })
> >> ()
> >> looks weird.
> >>
> >> Jerry, maybe the problem is in using PigStorage. As its javadoc says:
> >>
> >> A load function that parses a line of input into fields using a character
> >> delimiter
> >>
> >> So I guess this is just for simple CSV lines.
> >> But you are trying to load a complicated Map structure as it was formatted
> >> by the previous store.
> >> Probably you'll need to write your own Loader for this. Another hint: using
> >> the -schema parameter to PigStorage, but I am not sure it can help :(
> >>
> >> Ruslan
> >>
> >>
> >> On Wed, Apr 17, 2013 at 11:48 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> >>
> >> > Hi Ruslan:
> >> >
> >> > I did a describe B followed by a dump B, the output is:
> >> > B: {b: {()}}
> >> >
> >> > ({ ([c#11,d#22]),([c#33,d#44]) })
> >> > ()
> >> >
> >> > but when I executed
> >> >
> >> > C = foreach B generate flatten(b);
> >> >
> >> > dump C;
> >> >
> >> > I got the exception again...
> >> >
> >> > 2013-04-17 15:47:39,933 [Thread-26] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
> >> > java.lang.Exception: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
> >> > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> >> > Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.DataBag
> >> > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:586)
> >> > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:250)
> >> > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
> >> > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
> >> > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
> >> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
> >> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> >> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)