Pig >> mail # user >> Unable to load data using PigStorage that was previously stored using PigStorage


Jerry Lam 2013-04-17, 01:28
Ruslan Al-Fakikh 2013-04-17, 15:57
Jerry Lam 2013-04-17, 17:11
Ruslan Al-Fakikh 2013-04-17, 17:22
Ruslan Al-Fakikh 2013-04-17, 17:24
Jerry Lam 2013-04-17, 17:38
Ruslan Al-Fakikh 2013-04-17, 19:26
Jerry Lam 2013-04-17, 19:48
Ruslan Al-Fakikh 2013-04-18, 02:52
Jerry Lam 2013-04-18, 14:37
Jerry Lam 2013-04-18, 14:43
Prashant Kommireddi 2013-04-18, 16:34
Jerry Lam 2013-04-18, 21:14
Prashant Kommireddi 2013-04-18, 21:41
Re: Unable to load data using PigStorage that was previously stored using PigStorage
Hi Prashant:

Just trying to understand my mistake...
I thought "B = foreach A generate document#'b' as b:bag{};" would cast the
bytearray to a bag because of the b:bag{} declaration. If I understand you
correctly, it does not. Is that right?
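
For reference, here is a minimal sketch of the two forms discussed in this thread. The LOAD line, the 'data' path and the document:map[] schema are assumptions added for illustration and do not appear in the original messages. The behaviour noted in the comments is only what was reported in this thread, not a statement about every Pig version.

```pig
-- Assumed input: each record holds a single untyped map named document.
-- The path 'data' and this schema are hypothetical.
A = LOAD 'data' USING PigStorage() AS (document:map[]);

-- Explicit cast: the map value (a bytearray by default) is converted to a bag.
-- This is the form reported to work in this thread.
B = FOREACH A GENERATE (bag{})document#'b' AS b;

-- Type given only in the AS clause, with no cast on the expression.
-- This is the form reported in this thread to fail with
-- java.lang.ClassCastException (DataByteArray cannot be cast to DataBag).
-- B = FOREACH A GENERATE document#'b' AS b:bag{};

DESCRIBE B;
```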

Best Regards,

Jerry
On Thu, Apr 18, 2013 at 5:41 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:

> Hi Jerry,
>
> Like I mentioned in my earlier email "Map values by default are bytearrays.
> If you need them to be any other type, you would need to define it
> explicitly."
>
> The difference between the 2 statements is that one casts the value to
> "bag", while the other leaves it as a bytearray (the default).
>
> On Thu, Apr 18, 2013 at 2:14 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
>
> > Hi Prashant:
> >
> > IT WORKS! THANKS!
> > What is the difference between:
> > "B = foreach A generate (bag{})document#'b' as b;"
> > and
> > "B = foreach A generate document#'b' as b:bag{};"
> > ?
> >
> > The latter gives an error: java.lang.ClassCastException:
> > org.apache.pig.data.DataByteArray cannot be cast to
> > org.apache.pig.data.DataBag
> >
> > Best Regards,
> >
> > Jerry
> >
> >
> > On Thu, Apr 18, 2013 at 12:34 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >
> > > Well, let me rephrase - the values all have to be the same type if you
> > > choose to read all columns in a similar way. If you know in advance it's
> > > always the value associated with key 'b' that's a bag, why don't you cast
> > > that single value?
> > >
> > > B = foreach A generate (bag{})document#'b' as b;
> > >
> > >
> > > On Thu, Apr 18, 2013 at 7:43 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi Prashant:
> > > >
> > > > I read about the map data type in the book "Programming Pig", it says:
> > > > "... By default there is no requirement that all values in a map must be of
> > > > the same type. It is legitimate to have a map with two keys name and age,
> > > > where the value for name is a chararray and the value for age is an int.
> > > > Beginning in Pig 0.9, a map can declare its values to all be of the same
> > > > type... "
> > > >
> > > > I agree that all values in the map can be of the same type, but this is
> > > > not required in Pig.
> > > >
> > > > Best Regards,
> > > >
> > > > Jerry
> > > >
> > > >
> > > > On Thu, Apr 18, 2013 at 10:37 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi Ruslan:
> > > > >
> > > > > I used PigStorage to store data that originally used Pig data types.
> > > > > It is strange (or a bug in Pig) that I cannot use PigStorage to read
> > > > > data that was stored using PigStorage, isn't it?
> > > > >
> > > > > Best Regards,
> > > > >
> > > > > Jerry
> > > > >
> > > > >
> > > > >
> > > > > > On Wed, Apr 17, 2013 at 10:52 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > >> The output:
> > > > >> ({ ([c#11,d#22]),([c#33,d#44]) })
> > > > >> ()
> > > > >> looks weird.
> > > > >>
> > > > > >> Jerry, maybe the problem is in using PigStorage. As its javadoc says:
> > > > > >>
> > > > > >> A load function that parses a line of input into fields using a character
> > > > > >> delimiter
> > > > > >>
> > > > > >> So I guess this is just for simple csv lines.
> > > > > >> But you are trying to load a complicated Map structure as it was formatted
> > > > > >> by the previous store.
> > > > > >> Probably you'll need to write your own Loader for this. Another hint: try
> > > > > >> using the -schema parameter to PigStorage, but I am not sure it can help :(
> > > > >>
> > > > >> Ruslan
> > > > >>
> > > > >>
> > > > > >> On Wed, Apr 17, 2013 at 11:48 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > > > >>
> > > > > >> > Hi Ruslan:
> > > > >> >
> > > > >> > I did a describe B followed by a dump B, the output is:
> > > > >> > B: {b: {()}}
> > > > >> >
> > > > >> > ({ ([c#11,d#22]),([c#33,d#44]) })
> > > > >> > ()
> > > > >> >
> > > > >> > but when I executed
> > > > >> >
> > >
Ruslan Al-Fakikh 2013-04-19, 20:56
Jerry Lam 2013-04-20, 00:39
Prashant Kommireddi 2013-04-18, 06:56