Pig >> mail # user >> Unable to load data using PigStorage that was previously stored using PigStorage


Re: Unable to load data using PigStorage that was previously stored using PigStorage
Hi Ruslan:

No worries. It is all good. :) I still have a lot to learn about Pig.
The JIRAs you pointed to did clarify my misunderstandings. Thank you for
your help!

Best Regards,

Jerry
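
For anyone landing on this thread, the two statements compared in the quoted messages below can be put side by side as a minimal sketch. The input path 'data.txt' and the load schema (a single map column named document) are assumptions made for illustration; only the two FOREACH statements come from the thread itself, with the aliases renamed so both can sit in one script.

```pig
-- Assumption: 'data.txt' is a hypothetical path, and the schema is inferred
-- from the field name "document" used in the thread.
A = LOAD 'data.txt' USING PigStorage() AS (document:map[]);

-- Explicit cast: map values are bytearrays by default, so the value for
-- key 'b' is converted to a bag before it is named b.
B_cast = FOREACH A GENERATE (bag{})document#'b' AS b;

-- Type in the AS clause: per the thread (see PIG-2216 and PIG-2315), this
-- only renames the field and does not cast it. Jerry reported a
-- ClassCastException (DataByteArray cannot be cast to DataBag) with this form.
B_renamed = FOREACH A GENERATE document#'b' AS b:bag{};
```

The behaviour described in the comments is what was reported for the Pig version in use in April 2013; later releases may differ.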
On Fri, Apr 19, 2013 at 4:56 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:

> Hi Jerry,
> Sorry I misled you in my suggestions a bit:)
> As for your last question: it was interesting for me to investigate the
> issue. Here is what I found:
> https://issues.apache.org/jira/browse/PIG-2216
> https://issues.apache.org/jira/browse/PIG-2315
> So here:
> B = foreach A generate document#'b' as b:bag{};
> due to the misleading Pig syntax/behaviour you are not casting, just
> renaming :(
>
> Ruslan
>
>
>
> On Fri, Apr 19, 2013 at 2:57 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
>
> > Hi Prashant:
> >
> > Just trying to understand my mistake...
> > I thought "B = foreach A generate document#'b' as b:bag{};" would cast the
> > bytearray to a bag because of b:bag{}. If I understand correctly, that is
> > not what happens. Am I correct?
> >
> > Best Regards,
> >
> > Jerry
> >
> >
> >
> >
> > On Thu, Apr 18, 2013 at 5:41 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Jerry,
> > >
> > > Like I mentioned in my earlier email, "Map values by default are
> > > bytearrays. If you need them to be any other type, you would need to
> > > define it explicitly."
> > >
> > > The difference between the two statements is that one casts the value
> > > to "bag" and the other leaves it as a bytearray (the default).
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Apr 18, 2013 at 2:14 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi Prashant:
> > > >
> > > > IT WORKS! THANKS!
> > > > What is the difference between
> > > > B = foreach A generate (bag{})document#'b' as b;
> > > > and
> > > > B = foreach A generate document#'b' as b:bag{};
> > > > ?
> > > >
> > > > The latter gives an error: java.lang.ClassCastException:
> > > > org.apache.pig.data.DataByteArray cannot be cast to
> > > > org.apache.pig.data.DataBag
> > > >
> > > > Best Regards,
> > > >
> > > > Jerry
> > > >
> > > >
> > > > On Thu, Apr 18, 2013 at 12:34 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Well, let me rephrase - the values all have to be the same type if you
> > > > > choose to read all columns in a similar way. If you know in advance it's
> > > > > always the value associated with key 'b' that's a bag, why don't you cast
> > > > > that single value?
> > > > >
> > > > > B = foreach A generate (bag{})document#'b' as b;
> > > > >
> > > > >
> > > > > On Thu, Apr 18, 2013 at 7:43 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Hi Prashant:
> > > > > >
> > > > > > I read about the map data type in the book "Programming Pig", it says:
> > > > > > "... By default there is no requirement that all values in a map must be of
> > > > > > the same type. It is legitimate to have a map with two keys name and age,
> > > > > > where the value for name is a chararray and the value for age is an int.
> > > > > > Beginning in Pig 0.9, a map can declare its values to all be of the same
> > > > > > type... "
> > > > > >
> > > > > > I agree that all values in the map can be of the same type but this is not
> > > > > > required in Pig.
> > > > > >
> > > > > > Best Regards,
> > > > > >
> > > > > > Jerry
> > > > > >
> > > > > >
> > > > > > On Thu, Apr 18, 2013 at 10:37 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > > Hi Ruslan:
> > > > > > >
> > > > > > > I used PigStorage to store data that originally used Pig data
> > > > > > > types. It is strange (or a bug in Pig) that I cannot read data using
> > > > > > > PigStorage that was stored using PigStorage, isn't it?
> > > > > > >
> > > > > > > Best Regards,
> > > > > > >
> > > > > > > Jerry
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Apr 17, 2013 at 10:52 PM, Ruslan Al-Fakikh <