Avro, mail # dev - Question about enum usage with GenericData


Ken Krugler 2010-11-15, 00:08
Question about enum usage with GenericData
Ken Krugler 2010-11-15, 01:01
Answering my own question...

The object I pass to GenericData.Record.put(fieldName, object) has to
implement the GenericEnumSymbol interface; a plain Java enum value
doesn't.
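
For anyone hitting the same error, here is a minimal sketch of why a plain Java enum constant is rejected while a wrapped symbol is accepted. It deliberately does not use Avro: EnumSymbolLike, Symbol, resolveUnion and toSymbol are hypothetical stand-ins that model an instanceof-style check on the ["null", enum] union, not Avro's actual code. In real code the wrapper would be Avro's own GenericData.EnumSymbol, which implements GenericEnumSymbol; its constructor arguments depend on the Avro version, so check the Javadoc for the release in use.

```java
class EnumUnionSketch {

    /** Hypothetical stand-in for Avro's GenericEnumSymbol interface. */
    interface EnumSymbolLike {
        String symbol();
    }

    /** Hypothetical wrapper that carries only the symbol name. */
    static final class Symbol implements EnumSymbolLike {
        private final String name;

        Symbol(String name) {
            this.name = name;
        }

        @Override
        public String symbol() {
            return name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    /** Same shape as the enum in the original post. */
    enum TestEnum { ONE, TWO }

    /**
     * Simplified model of resolving a datum against the union ["null", enum].
     * Returns the index of the matching branch, or throws if none matches.
     */
    static int resolveUnion(Object datum) {
        if (datum == null) {
            return 0; // the "null" branch
        }
        if (datum instanceof EnumSymbolLike) {
            return 1; // the enum branch
        }
        // A java.lang.Enum is not an EnumSymbolLike, so it ends up here.
        throw new IllegalArgumentException("Not in union: " + datum);
    }

    /** Convert a Java enum constant into the wrapper using its name. */
    static EnumSymbolLike toSymbol(Enum<?> value) {
        return new Symbol(value.name());
    }

    public static void main(String[] args) {
        // Wrapped symbol: matches the enum branch.
        System.out.println(resolveUnion(toSymbol(TestEnum.ONE)));
        try {
            // Raw Java enum: no branch matches.
            resolveUnion(TestEnum.ONE);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The conversion goes through Enum.name(), so the symbol string matches the "symbols" list in a schema generated from the same enum class.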

-- Ken

=================================================================
I've been looking at adding enum support to the Cascading AvroScheme,  
but I'm perplexed by an error I run into.

I create an Avro Schema based on an enum class passed in from
Cascading. The resulting JSON looks like:

{
  "type": "record",
  "name": "CascadingAvroSchema",
  "namespace": "",
  "fields": [
    {
      "name": "a",
      "type": [
        "null",
        {
          "type": "enum",
          "name": "AvroSchemeTest$TestEnum",
          "namespace": "com.bixolabs.cascading.avro",
          "symbols": ["ONE", "TWO"]
        }
      ],
      "doc": ""
    }
  ]
}

I'm writing out the data using code that does this:

    GenericData.Record datum = new GenericData.Record(getSchema());
    datum.put(fieldName, object);

where object is set to TestEnum.ONE.

But when I try to write out the result using the AvroOutputFormat, I  
get:

Caused by: org.apache.avro.AvroRuntimeException: Not in union ["null",{"type":"enum","name":"AvroSchemeTest$TestEnum","namespace":"com.bixolabs.cascading.avro","symbols":["ONE","TWO"]}]: ONE
    at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:372)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:69)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:102)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:64)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:56)
    at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
    at org.apache.avro.mapred.AvroOutputFormat$1.write(AvroOutputFormat.java:93)
    at org.apache.avro.mapred.AvroOutputFormat$1.write(AvroOutputFormat.java:90)

Any input on what I'm doing wrong? Since I'm dynamically creating
the schema based on the Cascading Tuple fields/types, I'm guessing my
schema isn't appropriate.

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g