Avro >> mail # dev >> Question about enum usage with GenericData


Ken Krugler 2010-11-15, 00:08
Question about enum usage with GenericData
Answering my own question...

The object I pass to GenericData.Record.put(fieldName, object) needs to
implement the GenericEnumSymbol interface; the raw Java enum value is
not accepted.
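
To illustrate why the raw enum value was rejected, here is a minimal, self-contained sketch of union resolution by runtime type. It is a simplified model and not Avro's actual code: EnumSymbolLike, Symbol, and resolveUnion are hypothetical stand-ins (for GenericEnumSymbol, for a wrapper class that implements it, and for GenericData.resolveUnion), and the union branches are reduced to plain strings.

```java
import java.util.Arrays;
import java.util.List;

class UnionResolveSketch {
    /** Hypothetical stand-in for Avro's GenericEnumSymbol marker interface. */
    interface EnumSymbolLike {}

    /** Hypothetical wrapper that carries only the symbol name. */
    static final class Symbol implements EnumSymbolLike {
        private final String name;
        Symbol(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    enum TestEnum { ONE, TWO }

    /** Returns the index of the first union branch that accepts the datum. */
    static int resolveUnion(List<String> branches, Object datum) {
        for (int i = 0; i < branches.size(); i++) {
            String type = branches.get(i);
            if (type.equals("null") && datum == null) return i;
            // The enum branch matches on the marker interface, not on java.lang.Enum.
            if (type.equals("enum") && datum instanceof EnumSymbolLike) return i;
        }
        throw new IllegalStateException("Not in union " + branches + ": " + datum);
    }

    public static void main(String[] args) {
        List<String> union = Arrays.asList("null", "enum");
        // A wrapped symbol is accepted by the enum branch.
        System.out.println(resolveUnion(union, new Symbol(TestEnum.ONE.name())));
        try {
            // A plain Java enum matches no branch in this model.
            resolveUnion(union, TestEnum.ONE);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In real code, Avro's GenericData.EnumSymbol is one implementation of GenericEnumSymbol. Its constructor arguments may differ between Avro releases, so check the Javadoc for the version in use.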

-- Ken

=================================================================
I've been looking at adding enum support to the Cascading AvroScheme,  
but I'm perplexed by an error I run into.

I create an Avro Schema based on an enum class passed in from  
Cascading - the result in Json looks like:

{"type":"record","name":"CascadingAvroSchema","namespace":"","fields":[{"name":"a","type":["null",{"type":"enum","name":"AvroSchemeTest$TestEnum","namespace":"com.bixolabs.cascading.avro","symbols":["ONE","TWO"]}],"doc":""}]}
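
As a rough illustration of deriving an enum schema from a Java enum class, the sketch below builds the JSON by hand using only the standard library. This is an assumption about the general approach, not the AvroScheme code: EnumSchemaSketch and enumSchemaJson are hypothetical names, and the nested class's binary name (with its '$') is carried through unchanged, as in the schema shown in this message.

```java
import java.util.StringJoiner;

class EnumSchemaSketch {
    enum TestEnum { ONE, TWO }

    /** Builds an Avro-style enum schema as a JSON string from a Java enum class. */
    static String enumSchemaJson(Class<? extends Enum<?>> enumClass) {
        // Binary name: nested classes appear as Outer$Inner.
        String fullName = enumClass.getName();
        int lastDot = fullName.lastIndexOf('.');
        String namespace = lastDot < 0 ? "" : fullName.substring(0, lastDot);
        String name = fullName.substring(lastDot + 1);

        // Collect the enum constants as quoted JSON strings.
        StringJoiner symbols = new StringJoiner(",", "[", "]");
        for (Enum<?> constant : enumClass.getEnumConstants()) {
            symbols.add("\"" + constant.name() + "\"");
        }
        return "{\"type\":\"enum\",\"name\":\"" + name
                + "\",\"namespace\":\"" + namespace
                + "\",\"symbols\":" + symbols + "}";
    }

    public static void main(String[] args) {
        System.out.println(enumSchemaJson(TestEnum.class));
    }
}
```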

I'm writing out the data using code that does this:

    GenericData.Record datum = new GenericData.Record(getSchema());
    datum.put(fieldName, object);

where object is set to TestEnum.ONE.

But when I try to write out the result using the AvroOutputFormat, I  
get:

Caused by: org.apache.avro.AvroRuntimeException: Not in union ["null",{"type":"enum","name":"AvroSchemeTest$TestEnum","namespace":"com.bixolabs.cascading.avro","symbols":["ONE","TWO"]}]: ONE
    at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:372)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:69)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:102)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:64)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:56)
    at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
    at org.apache.avro.mapred.AvroOutputFormat$1.write(AvroOutputFormat.java:93)
    at org.apache.avro.mapred.AvroOutputFormat$1.write(AvroOutputFormat.java:90)

Any input on what I'm doing wrong? Since I'm dynamically creating
the schema based on the Cascading Tuple fields/types, I'm guessing my
schema isn't appropriate.

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g
