Pig, mail # user - Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
Ruslan Al-Fakikh 2013-04-17, 17:22
I am not sure whether you can convert a map to a tuple.
But I am curious about one thing:
you are trying to use 'b' as a Bag, right? FLATTEN needs it to be
a Bag, I guess:
http://pig.apache.org/docs/r0.10.0/basic.html#flatten
But it seems that Pig thinks that b is a byte array:
java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
cast to org.apache.pig.data.DataBag
Can you run this?
DESCRIBE B;

I suppose it can look like a Bag in the output of DUMP, but I think Pig
doesn't know it is a Bag. Maybe you'll need some kind of explicit cast?
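
A minimal sketch of what such a cast could look like. It assumes that the
values of an untyped map (document:[]) come back as bytearray and that
PigStorage can convert that bytearray to a bag. The inner schema
bag{tuple(map[])} is a guess based on the sample line, and this has not been
run against the actual data:

```pig
A = load 'data.txt' as document:[];

-- Without a cast, DESCRIBE should report b as bytearray, not as a bag.
B = foreach A generate document#'b' as b;
describe B;

-- Assumed fix: cast the map value to a bag of one-field tuples holding maps.
-- B2 is a hypothetical alias introduced only for this sketch.
B2 = foreach A generate (bag{tuple(map[])})document#'b' as b;
describe B2;

-- FLATTEN should now receive a bag instead of a DataByteArray.
C = foreach B2 generate flatten(b);
dump C;
```

If DESCRIBE B2 still shows bytearray, the cast was not applied and the schema
in the cast is the first thing to check.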
On Wed, Apr 17, 2013 at 9:11 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:

> Hi Ruslan,
>
> I tried to debug each step already but no luck.
> I did a dump (dump B;) after B = foreach A generate document#'b' as b;
> I got {([c#11,d#22]),([c#33,d#44])}
> but it fails when I do C = foreach B generate flatten(b);
>
> I don't have control over the input. It is passed as a map of maps. I guess
> it makes lookup easier using a map with keys.
>
> Can I convert map to tuple?
>
> Best Regards,
>
> Jerry
>
>
>
> On Wed, Apr 17, 2013 at 11:57 AM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
>
> > Hi Jerry,
> >
> > I would recommend debugging the issue step by step. Start with this line:
> > A = load 'data.txt' as document:[];
> > and then right after that:
> > DESCRIBE A;
> > DUMP A;
> > and so on...
> >
> > To be honest I haven't used maps that much. Just curious, why did you
> > choose to use them? You can also use regular tuples for storing the
> > relations. Also you can store the tuples with a schema file.
> >
> > Ruslan
> >
> >
> > On Wed, Apr 17, 2013 at 5:28 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> >
> > > Hi pig users,
> > >
> > > I tried to load data using PigStorage that was previously stored using
> > > PigStorage but it failed.
> > >
> > > Each line looks like this in the data file that is generated by
> > > PigStorage:
> > > [a#hello,b#{([c#11,d#22]),([c#33,d#44])}]
> > >
> > > I did the following:
> > > A = load 'data.txt' as document:[];
> > > B = foreach A generate document#'b' as b;
> > > C = foreach B generate flatten(b);
> > > dump C;
> > >
> > > I expect to see the following output:
> > > ([c#11,d#22])
> > > ([c#33,d#44])
> > >
> > > Instead, I got:
> > > java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
> > > cast to org.apache.pig.data.DataBag
> > >
> > > Has anyone encountered this problem before? How can I read the data back?
> > >
> > > Thanks,
> > >
> > > Jerry
> > >
> >
>