Pig >> mail # user >> Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
Hey, and as for converting a map of tuples, I probably misunderstood you. If
you can get to every value manually within FOREACH, then I see no problem
in doing so.
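
For illustration, here is a minimal sketch of the explicit cast and the manual per-key lookup discussed in this thread. It is untested against the actual data and rests on assumptions: the Pig release must support casting a bytearray to a bag, each tuple in the bag must hold exactly one map, and the values under 'c' and 'd' must be integers. The aliases D, m, c and d are made up for the example.

```pig
-- Load each line as an untyped map; its values default to bytearray.
A = LOAD 'data.txt' AS (document:map[]);

-- Assumption: casting the looked-up value to a bag of single-map tuples
-- lets FLATTEN receive a DataBag instead of a DataByteArray.
B = FOREACH A GENERATE (bag{tuple(map[])}) document#'b' AS b;
DESCRIBE B;

-- One output record per inner map.
C = FOREACH B GENERATE FLATTEN(b) AS m;

-- Reach each value manually by key, which turns the map into a plain tuple.
-- The (int) casts assume the stored values are integers.
D = FOREACH C GENERATE (int) m#'c' AS c, (int) m#'d' AS d;
DUMP D;
```

If DESCRIBE B still reports b as a bytearray, the cast is not being applied and the schema would need to be declared some other way.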
On Wed, Apr 17, 2013 at 9:22 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:

> I am not sure whether you can convert a map to a tuple.
> But I am curious about one thing:
> you are trying to use 'b' as a Bag, right? Because FLATTEN needs it to be
> a Bag, I guess:
> http://pig.apache.org/docs/r0.10.0/basic.html#flatten
> But it seems that Pig thinks that b is a byte array:
> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
> cast to org.apache.pig.data.DataBag
> Can you do this?:
> DESCRIBE B;
>
> I suppose it can look like a Bag in the output of DUMP, but I think Pig
> doesn't know it is a Bag, maybe you'll need some kind of explicit cast?
>
>
> On Wed, Apr 17, 2013 at 9:11 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
>
>> Hi Ruslan,
>>
>> I tried to debug each step already but no luck.
>> I did a dump (dump B;) after B = foreach A generate document#'b' as b;
>> I got {([c#11,d#22]),([c#33,d#44])}
>> but it fails when I did C = foreach B generate flatten(b);
>>
>> I don't have control over the input. It is passed as a Map of Maps. I guess
>> it makes lookup easier using a map with keys.
>>
>> Can I convert map to tuple?
>>
>> Best Regards,
>>
>> Jerry
>>
>>
>>
>> On Wed, Apr 17, 2013 at 11:57 AM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
>>
>> > Hi Jerry,
>> >
>> > I would recommend debugging the issue step by step. Start with this line:
>> > A = load 'data.txt' as document:[];
>> > and then right after that:
>> > DESCRIBE A;
>> > DUMP A;
>> > and so on...
>> >
>> > To be honest I haven't used maps that much. Just curious, why did you
>> > choose to use them? You can also use regular tuples for storing the
>> > relations. Also you can store the tuples with a schema file.
>> >
>> > Ruslan
>> >
>> >
>> > On Wed, Apr 17, 2013 at 5:28 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
>> >
>> > > Hi pig users,
>> > >
>> > > I tried to load data using PigStorage that was previously stored using
>> > > PigStorage but it failed.
>> > >
>> > > Each line looks like this in the data file that is generated by
>> > > PigStorage:
>> > > [a#hello,b#{([c#11,d#22]),([c#33,d#44])}]
>> > >
>> > > I did the following:
>> > > A = load 'data.txt' as document:[];
>> > > B = foreach A generate document#'b' as b;
>> > > C = foreach B generate flatten(b);
>> > > dump C;
>> > >
>> > > I expect to see the following output:
>> > > ([c#11,d#22])
>> > > ([c#33,d#44])
>> > >
>> > > Instead, I got:
>> > > java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
>> > > cast to org.apache.pig.data.DataBag
>> > >
>> > > Has anyone encountered this problem before? How can I read the data back?
>> > >
>> > > Thanks,
>> > >
>> > > Jerry
>> > >
>> >
>>
>
>