Avro >> mail # user >> optional enums

optional enums
What's the "best" way to represent an optional enum in Avro (in terms of
space efficiency, computational efficiency, and readability)?  To be
consistent with my other optional fields, I was planning to use a union of
null and my enum type.  The other approach I could see was adding a NULL
symbol to the enum itself -- but then my code would have to initialize the
enum field to that NULL symbol before every write.
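
On the space question, here is a rough sketch of what each option costs on the wire, assuming Avro's documented binary encoding (a union is written as a zig-zag varint branch index followed by the value, an enum as a zig-zag varint symbol index, and null as zero bytes). The class and method names below are hypothetical and hand-rolled for illustration; this is not code from the Avro library.

```java
import java.io.ByteArrayOutputStream;

// Hand-rolled illustration of the encoding rules; names are hypothetical.
final class OptionalEnumSize {

    // Avro writes ints and longs as zig-zag varints.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        long z = (value << 1) ^ (value >> 63);      // zig-zag transform
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));   // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                         // final byte, no continuation bit
    }

    // Option 1: field typed as the union ["null", Suit].
    // A null symbolIndex stands for "field absent".
    static byte[] encodeAsUnion(Integer symbolIndex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (symbolIndex == null) {
            writeVarint(out, 0);                    // branch 0 = "null"; null adds no bytes
        } else {
            writeVarint(out, 1);                    // branch 1 = the enum
            writeVarint(out, symbolIndex);          // position of the symbol
        }
        return out.toByteArray();
    }

    // Option 2: plain enum carrying an extra NULL symbol.
    static byte[] encodeAsPlainEnum(int symbolIndex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, symbolIndex);              // just the symbol position
        return out.toByteArray();
    }
}
```

Under those assumptions, for an enum this small the union costs one byte when the value is absent and two bytes when it is present, while the enum with a NULL symbol always costs one byte. So the union is at most one byte per value larger, in exchange for staying consistent with the other optional fields.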

I've tried to use union of null and the enum-type, but I've run into an
issue with this approach when using the AvroOutputFormat.  The following
code summarizes my issue:

  public void testDataWriteWithSchema() throws IOException {
    // Default constructor: Event.class is not passed to the datum writer.
    final DataFileWriter<Event> writer =
        new DataFileWriter<Event>(new SpecificDatumWriter<Event>());

    writer.create(Event.SCHEMA$, new File("target/datafile-test.avro"));
    // ...
  }

  public void testDataWriteWithSchemaWithClass() throws IOException {
    // Same setup, but the datum writer is given Event.class.
    final DataFileWriter<Event> writer =
        new DataFileWriter<Event>(new SpecificDatumWriter<Event>(Event.class));

    writer.create(Event.SCHEMA$, new File("target/datafile-test.avro"));
    // ...
  }

When I don't pass in the Event.class to SpecificDatumWriter (the first test
method), the above test fails with the following exception:

Not in union
["null", {"type":"enum","name":"Suit","namespace":"foo","symbols":["SPADES","CLUBS","HEARS","DIAMONDS"]}]:
 at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:382)
 at org.apache.avro.generic.GenericDatumWriter.write(
 at org.apache.avro.generic.GenericDatumWriter.writeRecord(
 at org.apache.avro.generic.GenericDatumWriter.write(
 at org.apache.avro.generic.GenericDatumWriter.write(
 at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)

AvroOutputFormat uses SpecificDatumWriter's default constructor, so I run
into the above exception whenever I use it.  Is there some way around this,
other than implementing my own OutputFormat that passes along the class?