-
Getting errors with BinSedesTuple in my storefunc
Jeremy Hanna 2011-04-08, 16:30
I do a lot of processing on my data and then reformat it so it can go back into my data store through the storefunc. When I store it out to HDFS, it looks just fine. However, when I try to persist it, I get an exception saying that one of the values can't be cast from org.apache.pig.data.BinSedesTuple to org.apache.pig.data.DataByteArray. I had been assuming (in my storefunc) that the value would be a DataByteArray, and from the javadocs it looks like BinSedesTuple is a type used only for intermediate processing. So I'm just wondering: should I convert this manually, or is something wrong?
Thanks,
Jeremy
-
Re: Getting errors with BinSedesTuple in my storefunc
Thejas M Nair 2011-04-08, 20:35
The bytearray datatype also represents the 'unknown' type, i.e. if Pig does not know the type of a field, it uses the bytearray type. In such cases the actual object is not guaranteed to be an instance of DataByteArray. I am wondering if, in the storefunc, you are casting an 'unknown' type (which happens to be a tuple) to DataByteArray. Can you check whether Pig is doing the right thing by returning a Tuple in this case? (BinSedesTuple implements the Tuple interface.)
Thanks,
Thejas
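To make the point above concrete, here is a minimal, self-contained sketch of checking a field's runtime class before casting it. It does not use Pig's classes: `Bytes` and `Row` are hypothetical stand-ins for DataByteArray and a Tuple implementation, and `describe` is an illustrative helper, not part of any Pig API.

```java
import java.util.Arrays;
import java.util.List;

public class FieldCheck {
    // Hypothetical stand-in for Pig's DataByteArray: wraps raw bytes.
    static final class Bytes {
        final byte[] data;
        Bytes(byte[] data) { this.data = data; }
    }

    // Hypothetical stand-in for a Tuple implementation such as BinSedesTuple.
    static final class Row {
        final List<Object> fields;
        Row(Object... fields) { this.fields = Arrays.asList(fields); }
    }

    // Inspect the runtime class of a field instead of casting blindly.
    static String describe(Object field) {
        if (field == null) {
            return "null";
        }
        if (field instanceof Bytes) {
            return "bytes:" + ((Bytes) field).data.length;
        }
        if (field instanceof Row) {
            return "tuple:" + ((Row) field).fields.size();
        }
        return "other:" + field.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Object untyped = new Row("a", "b"); // a field whose declared type is unknown

        // The blind cast compiles but fails at runtime, as in the reported error.
        try {
            Bytes wrong = (Bytes) untyped;
            System.out.println(wrong.data.length);
        } catch (ClassCastException e) {
            System.out.println("blind cast failed");
        }

        // Checking the runtime class first avoids the exception.
        System.out.println(describe(untyped));
    }
}
```

In a real storefunc the same check would be written against org.apache.pig.data.Tuple and DataByteArray; the sketch only shows why a field without a declared type cannot be assumed to hold a DataByteArray.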
-
Re: Getting errors with BinSedesTuple in my storefunc
Jeremy Hanna 2011-04-08, 22:38
Thejas,
I was being dumb about it: when my UDF returned the set of data, I had neglected to define the outgoing schema before sending it to the storefunc. Consequently the data had no schema, so Pig was doing the best it could with it. (Thanks, Dmitriy, for pointing that out to me.)
Thanks for the help!
Jeremy