Pig, mail # user - Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
Jerry Lam 2013-04-20, 00:39
Hi Ruslan:

No worries. It is all good. :) I still have a lot to learn about Pig.
The JIRAs you pointed to did clarify my misunderstandings. Thank you for
your help!

Best Regards,

Jerry
On Fri, Apr 19, 2013 at 4:56 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:

> Hi Jerry,
> Sorry I misled you in my suggestions a bit:)
> As for your last question: it was interesting for me to investigate the
> issue. Here is what I found:
> https://issues.apache.org/jira/browse/PIG-2216
> https://issues.apache.org/jira/browse/PIG-2315
> So here
> B = foreach A generate document#'b' as b:bag{};
> due to the misleading Pig syntax/behaviour you are not casting the value,
> just renaming it :(
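>
> A minimal sketch of the two forms discussed in this thread. The load
> path 'data' and the schema (document:map[]) are assumptions for
> illustration, not taken from Jerry's script:
>
> ```pig
> -- Assumed input: a single map column; map values default to bytearray
> A = LOAD 'data' USING PigStorage() AS (document:map[]);
>
> -- Explicit cast: converts the bytearray value to a bag
> B1 = FOREACH A GENERATE (bag{})document#'b' AS b;
>
> -- Type in the AS clause: as reported in this thread, this only
> -- declares the schema of b and does not convert the value, so using
> -- b as a bag failed with a ClassCastException
> B2 = FOREACH A GENERATE document#'b' AS b:bag{};
> ```
>
> The behaviour of the AS clause may differ in other Pig versions; the two
> JIRAs above have the details.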
>
> Ruslan
>
>
>
> On Fri, Apr 19, 2013 at 2:57 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
>
> > Hi Prashant:
> >
> > Just trying to understand my mistake...
> > I thought "B = foreach A generate document#'b' as b:bag{};" will cast
> > bytearray to bag because of b:bag{}. If I understand correctly, this is
> > not what happens. Am I correct?
> >
> > Best Regards,
> >
> > Jerry
> >
> >
> >
> >
> > On Thu, Apr 18, 2013 at 5:41 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Jerry,
> > >
> > > Like I mentioned in my earlier email "Map values by default are
> > > bytearrays. If you need them to be any other type, you would need to
> > > define it explicitly."
> > >
> > > The difference between the 2 statements is that one does a cast to
> > > "bag" and the other leaves the value as a bytearray (the default).
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Apr 18, 2013 at 2:14 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi Prashant:
> > > >
> > > > IT WORKS! THANKS!
> > > > What is the difference between:
> > > > B = foreach A generate (bag{})document#'b' as b;
> > > > and
> > > > B = foreach A generate document#'b' as b:bag{};
> > > > ?
> > > >
> > > > The latter gives error: java.lang.ClassCastException:
> > > > org.apache.pig.data.DataByteArray cannot be cast to
> > > > org.apache.pig.data.DataBag
> > > >
> > > > Best Regards,
> > > >
> > > > Jerry
> > > >
> > > >
> > > > On Thu, Apr 18, 2013 at 12:34 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Well, let me rephrase - the values all have to be the same type if
> > > > > you choose to read all columns in a similar way. If you know in
> > > > > advance it's always the value associated with key 'b' that's a bag,
> > > > > why don't you cast that single value?
> > > > >
> > > > > B = foreach A generate (bag{})document#'b' as b;
> > > > >
> > > > >
> > > > > On Thu, Apr 18, 2013 at 7:43 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Hi Prashant:
> > > > > >
> > > > > > I read about the map data type in the book "Programming Pig". It
> > > > > > says:
> > > > > > "... By default there is no requirement that all values in a map
> > > > > > must be of the same type. It is legitimate to have a map with two
> > > > > > keys name and age, where the value for name is a chararray and the
> > > > > > value for age is an int. Beginning in Pig 0.9, a map can declare
> > > > > > its values to all be of the same type... "
> > > > > >
> > > > > > I agree that all values in the map can be of the same type, but
> > > > > > this is not required in Pig.
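> > > > > >
> > > > > > As a sketch of the two kinds of map declaration the book passage
> > > > > > describes (the load path 'data' and the field name are assumptions
> > > > > > for illustration):
> > > > > >
> > > > > > ```pig
> > > > > > -- Untyped map: values may have mixed types and default to bytearray
> > > > > > A1 = LOAD 'data' AS (document:map[]);
> > > > > >
> > > > > > -- Typed map (Pig 0.9 and later): all values declared as chararray
> > > > > > A2 = LOAD 'data' AS (document:map[chararray]);
> > > > > > ```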
> > > > > >
> > > > > > Best Regards,
> > > > > >
> > > > > > Jerry
> > > > > >
> > > > > >
> > > > > > On Thu, Apr 18, 2013 at 10:37 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > > Hi Ruslan:
> > > > > > >
> > > > > > > I used PigStorage to store data that originally used Pig data
> > > > > > > types. It is strange (or a bug in Pig) that I cannot use PigStorage
> > > > > > > to read data that was stored using PigStorage, isn't it?
> > > > > > >
> > > > > > > Best Regards,
> > > > > > >
> > > > > > > Jerry
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Apr 17, 2013 at 10:52 PM, Ruslan Al-Fakikh <