Pig, mail # user - Pig and Avro Error


abhishek dodda 2013-06-09, 18:01
Re: Pig and Avro Error
Cheolsoo Park 2013-06-09, 19:26
>> java.lang.RuntimeException : Dataum 23.0 is not in union ["null" , "int"]

Since you specify no Avro schema in the STORE command, AvroStorage derives
the output Avro schema from the Pig schema. By default, AvroStorage converts
every primitive type to a nullable union. In this case, final has an integer
field, so AvroStorage converts it to the union ["null", "int"]. According to
the error message, one record holds a floating-point value (23.0) in that
field instead of an integer, so the write fails.
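
To make the failure concrete, here is a rough Python model of how a value is
matched against the branches of a nullable union. It is only a sketch of the
idea, not AvroStorage's actual code: the function names (matches_branch,
resolve_union) are made up for illustration, and only a few primitive
branches are handled.

```python
def matches_branch(datum, branch):
    """Return True if datum fits the named primitive branch (simplified)."""
    if branch == "null":
        return datum is None
    if branch == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(datum, int) and not isinstance(datum, bool)
    if branch in ("float", "double"):
        return isinstance(datum, float)
    return False


def resolve_union(datum, union):
    """Return the index of the first branch that accepts datum, or raise."""
    for index, branch in enumerate(union):
        if matches_branch(datum, branch):
            return index
    raise RuntimeError(f"Datum {datum!r} is not in union {union!r}")


nullable_int = ["null", "int"]
print(resolve_union(None, nullable_int))  # matches the "null" branch
print(resolve_union(23, nullable_int))    # matches the "int" branch
try:
    resolve_union(23.0, nullable_int)     # a float fits neither branch
except RuntimeError as err:
    print(err)
```

The point is that 23.0 is rejected even though it is numerically equal to 23:
the check is on the value's type, not on whether it could be converted.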

I would run DESCRIBE and DUMP on final to find which column is causing
the mismatch. It's hard to tell what the exact problem is without seeing
your data and schema.
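
Once you have the declared schema and a few dumped rows, the comparison can
be done mechanically. The Python sketch below shows one way to do it; the
helper name (find_mismatched_columns), the sample rows and the expected
types are all hypothetical, since I haven't seen your actual data.

```python
def find_mismatched_columns(rows, expected_types):
    """Map column index -> values whose Python type differs from the schema."""
    offenders = {}
    for row in rows:
        for index, (value, expected) in enumerate(zip(row, expected_types)):
            if value is None:
                continue  # nulls are fine in a nullable union
            if type(value) is not expected:
                offenders.setdefault(index, []).append(value)
    return offenders


# Hypothetical sample: the schema says (chararray, int), but one row
# carries a float in the second column.
sample_rows = [("x", 1), ("y", 23.0), ("z", None)]
print(find_mismatched_columns(sample_rows, [str, int]))
```

Any column index that shows up in the result is a candidate for the field
that the declared schema and the actual data disagree on.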

Thanks,
Cheolsoo

On Sun, Jun 9, 2013 at 11:01 AM, abhishek dodda
<[EMAIL PROTECTED]> wrote:

> hi all,
>
> Running pig with avro storage and facing the below issue
>
> pig 0.10 and avro 1.7
>
> org.apache.avro.file.DataFileWriter$AppenderWriteException :
> java.lang.RuntimeException : Dataum 23.0 is not in union ["null" , "int"]
>
> my pig script does the following
>
> a = load '/user/abhi/abc.txt' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> b = load '/user/abhi/def.txt' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> joined = join a  by $0 left outer , b by $0;
>
> final = foreach joined generate
> {
> ...
> ...
> };
>
> store final into '/user/abhi/output' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> Thanks
> abhishek
>
abhishek dodda 2013-06-09, 19:30