Type mismatch in key from map
shan s 2012-04-10, 19:03
I am currently getting "Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText". I looked up PIG-919 and the related comments, but could not understand the reason or the workaround for this problem.
Could you please explain this further?
I am getting this even before my GROUP, when I do my 3-way JOIN:
A1 = JOIN AA BY rid, BB BY rid;
A2 = JOIN A1 BY BB::cid, CC by cid;
DESCRIBE A2;
A3 = FOREACH A2 GENERATE FLATTEN((TOTUPLE(BB::rid)));
DESCRIBE A3;
DUMP A3;
DESCRIBE looks like below.
A2: {A1::AA::rid: bytearray,A1::AA::roname: bytearray,A1::AA::asid: bytearray,A1::AA::asname: bytearray,A1::BB::rid: bytearray,A1::BB::roname: bytearray,A1::BB::cid: bytearray,A1::BB::csname: bytearray,CC::cid: bytearray,CC::csname: bytearray,CC::chid: bytearray,CC::chname: bytearray}
A3: {org.apache.pig.builtin.totuple_A1::BB::rid_3::A1::BB::rid: bytearray}
In case the map was the problem, I tried converting it to a tuple (A3 above), but it still does not work. In fact, A3 is still described as a map (because of the {}, I guess). Why is that?
Appreciate your help! Thanks!!
Re: Type mismatch in key from map
Dmitriy Ryaboy 2012-04-10, 22:39
What type do you expect rid to be? Where did AA and BB come from?
D
Re: Type mismatch in key from map
shan s 2012-04-11, 03:15
When I load my data, I define all fields as chararray in the schema; I can afford to treat everything as chararray.
rid could be chararray (I have no real expectations; it's a GUID coming from a DB). AA and BB do come from a UDF, which does some string processing and returns substrings as tuples. Also, when I tried to convert rid to chararray in A3, I got an error, "can't convert to chararray.", without further explanation.
Thank you.
Re: Type mismatch in key from map
shan s 2012-04-11, 14:09
Hi Dmitriy,
It works after explicitly casting to chararray. Does that mean a bytearray field can't be used in a JOIN, or is there more to it? How can this behaviour be explained?
Thanks!
Re: Type mismatch in key from map
Dmitriy Ryaboy 2012-04-11, 15:25
No, you can join on bytearrays. What can't be done is have pig thinking you are joining on bytearrays when you are actually using strings under the covers -- that's what causes the error you are seeing.
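As an illustration of the explicit cast reported earlier in the thread to work, here is a minimal sketch. It is not taken from the thread: the *_typed aliases are hypothetical, and casting only the join keys before the JOIN is an assumption about where the cast was applied. The field names come from the DESCRIBE output in the first message.

```pig
-- Sketch only: AA, BB and CC are the UDF-produced relations from the thread.
-- The *_typed aliases are made up for this example.
-- Casting the join keys gives Pig a declared key type (chararray) that
-- matches the strings the UDF actually emits.
AA_typed = FOREACH AA GENERATE (chararray)rid AS rid, roname, asid, asname;
BB_typed = FOREACH BB GENERATE (chararray)rid AS rid, roname, (chararray)cid AS cid, csname;
CC_typed = FOREACH CC GENERATE (chararray)cid AS cid, csname, chid, chname;

A1 = JOIN AA_typed BY rid, BB_typed BY rid;
A2 = JOIN A1 BY BB_typed::cid, CC_typed BY cid;
DESCRIBE A2;
```

With this in place, DESCRIBE A2 should report the key fields as chararray rather than bytearray, so the declared key type and the values flowing through the map phase agree.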
Re: Type mismatch in key from map
Jon Coveney 2012-04-11, 14:36
For scalar projection to work, you have to explicitly cast the one-line relation to a scalar value. This is to make sure that it is clear what is going on in the script, as accidentally projecting a relation (usually in a GROUP BY situation) is common, and we want the parser to fail instead of doing an unexpected scalar projection.