Avro, mail # user - Default field values required for field deletion

Default field values required for field deletion
Mark Hayes 2012-10-01, 16:28

We're using Avro as the storage format for database records, and schema
evolution is a key feature for us.  I have a question regarding the
deletion of fields from a record, when a schema is changed.

Let's say field X is present in v1 of the schema without a default value,
and is deleted in v2 of the schema.  There can be a mix of v1 and v2
records in the database, and a mix of v1 and v2 client apps (apps that use
v1 or v2 as both their writer and reader schema).

If a v1 app reads a v2 record (written by a v2 app), an exception will be
thrown because the reader schema contains field X, the record being
deserialized does not contain field X, and the reader schema does not
contain a default value for field X.

Therefore, our conclusion is that a default value must be defined for each
field in a schema, in order to support deletion of that field from the
schema at a future time.
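
To make the rule concrete, here is a minimal sketch of how I understand
record-field resolution to work.  It is a toy model written with plain
Python dicts, not the Avro library: the `resolve` function, the record name
`Rec`, the `id` field and the three schema constants are all made up for
illustration, and only the shape of the field definitions follows Avro's
JSON schema format.

```python
# Toy model of the record-field resolution rule. This is NOT the Avro API;
# every name below is hypothetical and exists only for illustration.

V1_NO_DEFAULT = {
    "name": "Rec",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "X", "type": "string"},  # no default declared
    ],
}

V1_WITH_DEFAULT = {
    "name": "Rec",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "X", "type": "string", "default": ""},
    ],
}

V2 = {
    "name": "Rec",
    "fields": [
        {"name": "id", "type": "long"},  # field X has been deleted
    ],
}


def resolve(writer_schema, reader_schema, datum):
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # Field exists on both sides: take the written value.
            result[name] = datum[name]
        elif "default" in field:
            # Field is unknown to the writer: fall back to the reader's default.
            result[name] = field["default"]
        else:
            # Field is unknown to the writer and the reader has no default.
            raise ValueError(
                f"reader field {name!r} is missing from the writer schema "
                "and has no default"
            )
    # Writer fields that the reader does not declare are simply ignored.
    return result
```

In this model, `resolve(V2, V1_WITH_DEFAULT, {"id": 7})` fills in X from the
default, `resolve(V2, V1_NO_DEFAULT, {"id": 7})` raises, and a v2 app
reading a v1 record just drops X.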

To delete a field that does not define a default value, the only
possibility would be to upgrade all clients to v2 before using the v2
schema for writing.  This is usually impractical in a large distributed

My question is:  Does this make sense -- have I got it right?

Thanks in advance,