Pig >> mail # user >> Pig and Avro Error


Re: Pig and Avro Error
>> java.lang.RuntimeException : Dataum 23.0 is not in union ["null" , "int"]

Since you specify no Avro schema in the STORE command, AvroStorage derives
the output Avro schema from the Pig schema. By default, AvroStorage converts
every primitive type to a nullable union. In this case, final has an
integer field, so AvroStorage converts it to the union ["null", "int"].
According to the error message, one record contains a float (23.0) instead
of an integer in that field, so the write fails.

I would run DESCRIBE and DUMP on final to find which column is causing
the mismatch. It's hard to tell what the exact problem is without seeing
your data and schema.
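
As a rough sketch of that check, assuming the alias final from your script
(the final_sample alias and the sample size of 20 are placeholders I picked,
not anything from your job):

```pig
-- Sketch only: 'final' is the alias defined in the script quoted below.
-- Print the Pig schema that AvroStorage will translate into an Avro schema.
DESCRIBE final;

-- Inspect a handful of rows instead of dumping the whole relation.
-- 'final_sample' and the limit of 20 are arbitrary, hypothetical choices.
final_sample = LIMIT final 20;
DUMP final_sample;
```

Any column that DESCRIBE reports as int but that shows a value like 23.0 in
the DUMP output is the likely culprit. If the bad row does not appear in a
small sample, a larger limit or a FILTER on the suspect column may be needed.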

Thanks,
Cheolsoo

On Sun, Jun 9, 2013 at 11:01 AM, abhishek dodda
<[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I am running Pig with AvroStorage and hitting the issue below.
>
> Pig 0.10 and Avro 1.7
>
> org.apache.avro.file.DataFileWriter$AppenderWriteException :
> java.lang.RuntimeException : Dataum 23.0 is not in union ["null" , "int"]
>
> my pig script does the following
>
> a = load '/user/abhi/abc.txt' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> b = load '/user/abhi/def.txt' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> joined = join a by $0 left outer, b by $0;
>
> final = foreach joined generate
> {
> ...
> ...
> };
>
> store final into '/user/abhi/output' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> Thanks
> abhishek
>