Pig >> mail # user >> Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
Hi Jerry,

As I mentioned in my earlier email: "Map values by default are bytearrays.
If you need them to be any other type, you would need to define it
explicitly."

The difference between the two statements is that the first explicitly casts
the map value to a bag, while the second leaves it as a bytearray (the default).
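
To make the contrast concrete, here is a minimal sketch of the working form. The LOAD statement, the input path 'data', and the schema are assumptions for illustration, since the original script is not shown in this thread. Only the two foreach statements come from the messages below.

```pig
-- Assumed input: one map column named document (path and schema are hypothetical).
A = LOAD 'data' USING PigStorage() AS (document:map[]);

-- Explicit cast: the map value for key 'b' is converted from bytearray to a bag.
B = FOREACH A GENERATE (bag{})document#'b' AS b;

-- By contrast, the form below was reported in this thread to fail with a
-- ClassCastException (DataByteArray cannot be cast to DataBag):
-- B = FOREACH A GENERATE document#'b' AS b:bag{};

-- With the explicit cast, b can be flattened.
C = FOREACH B GENERATE FLATTEN(b);
DUMP C;
```
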
On Thu, Apr 18, 2013 at 2:14 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:

> Hi Prashant:
>
> IT WORKS! THANKS!
> What is the difference between :
> "B = foreach A generate (bag{})document#'b' as b;
> and
> B = foreach A generate document#'b' as b:bag{};"
> ?
>
> The latter gives error: java.lang.ClassCastException:
> org.apache.pig.data.DataByteArray cannot be cast to
> org.apache.pig.data.DataBag
>
> Best Regards,
>
> Jerry
>
>
> On Thu, Apr 18, 2013 at 12:34 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
>
> > Well, let me rephrase - the values all have to be the same type if you
> > choose to read all columns in a similar way. If you know in advance that it's
> > always the value associated with key 'b' that's a bag, why don't you cast
> > that single value?
> >
> > B = foreach A generate (bag{})document#'b' as b;
> >
> >
> > On Thu, Apr 18, 2013 at 7:43 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Prashant:
> > >
> > > I read about the map data type in the book "Programming Pig"; it says:
> > > "... By default there is no requirement that all values in a map must be of
> > > the same type. It is legitimate to have a map with two keys name and age,
> > > where the value for name is a chararray and the value for age is an int.
> > > Beginning in Pig 0.9, a map can declare its values to all be of the same
> > > type... "
> > >
> > > I agree that all values in the map can be of the same type, but this is not
> > > required in Pig.
> > >
> > > Best Regards,
> > >
> > > Jerry
> > >
> > >
> > > On Thu, Apr 18, 2013 at 10:37 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi Ruslan:
> > > >
> > > > I used PigStorage to store data that originally used Pig data types.
> > > > It is strange (or a bug in Pig) that I cannot use PigStorage to read data
> > > > that was stored using PigStorage, isn't it?
> > > >
> > > > Best Regards,
> > > >
> > > > Jerry
> > > >
> > > >
> > > >
> > > > On Wed, Apr 17, 2013 at 10:52 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> The output:
> > > >> ({ ([c#11,d#22]),([c#33,d#44]) })
> > > >> ()
> > > >> looks weird.
> > > >>
> > > > >> Jerry, maybe the problem is in using PigStorage. As its javadoc says:
> > > > >>
> > > > >> A load function that parses a line of input into fields using a character
> > > > >> delimiter
> > > > >>
> > > > >> So I guess this is just for simple CSV lines.
> > > > >> But you are trying to load a complicated Map structure as it was formatted
> > > > >> by the previous store.
> > > > >> Probably you'll need to write your own Loader for this. Another hint: try
> > > > >> the -schema parameter to PigStorage, but I am not sure it can help :(
> > > >>
> > > >> Ruslan
> > > >>
> > > >>
> > > > >> On Wed, Apr 17, 2013 at 11:48 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > > > >>
> > > > >> > Hi Ruslan:
> > > >> >
> > > >> > I did a describe B followed by a dump B, the output is:
> > > >> > B: {b: {()}}
> > > >> >
> > > >> > ({ ([c#11,d#22]),([c#33,d#44]) })
> > > >> > ()
> > > >> >
> > > >> > but when I executed
> > > >> >
> > > >> > C = foreach B generate flatten(b);
> > > >> >
> > > >> > dump C;
> > > >> >
> > > >> > I got the exception again...
> > > >> >
> > > >> > 2013-04-17 15:47:39,933 [Thread-26] WARN
> > > >> >  org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
> > > >> > java.lang.Exception: java.lang.ClassCastException:
> > > >> > org.apache.pig.data.DataByteArray cannot be cast to
> > > >> > org.apache.pig.data.DataBag
> > > >> > at
> > > >>
> > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> > > >> > Caused by: java.lang.ClassCastException:
> > > >> org.apache.pig.data.DataByteArray