Avro >> mail # user >> Default field values required for field deletion

Default field values required for field deletion

We're using Avro as the storage format for database records, and schema
evolution is a key feature for us.  I have a question regarding the
deletion of fields from a record, when a schema is changed.

Let's say field X is present in v1 of the schema without a default value,
and is then deleted in v2 of the schema.  There can be a mix
of v1 and v2 records in the database, and a mix of v1 and v2 client apps
(apps that use v1 or v2 as their writer and reader schema).

If a v1 app reads a v2 record (written by a v2 app), an exception will be
thrown because the reader schema contains field X, the record being
deserialized does not contain field X, and the reader schema does not
contain a default value for field X.
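
To make the failure concrete, here is a minimal sketch of the resolution rule described above. It is a simplified model written with plain Python dicts, not the Avro library: the names resolve_record, SchemaResolutionError, V1 and V2 are hypothetical, and only the "reader field missing from writer" rule is modeled.

```python
# Simplified model of record schema resolution (NOT the Avro library).
# Schemas and records are plain dicts; all names here are illustrative.

class SchemaResolutionError(Exception):
    """Raised when a reader field cannot be filled from the data or a default."""


def resolve_record(writer_schema, reader_schema, datum):
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # Field exists in the written data: copy it over.
            result[name] = datum[name]
        elif "default" in field:
            # Field is missing from the data, but the reader has a default.
            result[name] = field["default"]
        else:
            # Missing from the data and no default: resolution fails.
            raise SchemaResolutionError(
                "reader field %r has no default and is absent from the writer schema" % name
            )
    return result


# v1 has field X with no default; v2 deletes X.
V1 = {"type": "record", "name": "Rec", "fields": [
    {"name": "id", "type": "long"},
    {"name": "X", "type": "string"},
]}
V2 = {"type": "record", "name": "Rec", "fields": [
    {"name": "id", "type": "long"},
]}

# A v2 app reading a v1 record works: the extra field X is ignored.
print(resolve_record(V1, V2, {"id": 1, "X": "a"}))

# A v1 app reading a v2 record fails, as described above.
try:
    resolve_record(V2, V1, {"id": 1})
except SchemaResolutionError as err:
    print("error:", err)
```

Note the asymmetry: the v2 reader can always drop X, but the v1 reader has nothing to put in X.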

Therefore, our conclusion is that a default value must be defined for each
field in a schema, in order to support deletion of that field from the
schema at a future time.
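
If that conclusion holds, one way to enforce it would be a small check that lists the fields of a record schema that declare no default, i.e. the fields that could not later be deleted without breaking older readers. This is only a sketch over a schema parsed into a plain dict; fields_without_default and the example schema are hypothetical names, not part of Avro.

```python
import json


def fields_without_default(schema):
    """Return the names of record fields that declare no default value."""
    # The check is on the presence of the "default" key, so a field whose
    # default is null (None after parsing) still counts as having a default.
    return [f["name"] for f in schema["fields"] if "default" not in f]


# Example v1 schema: X declares a default, id does not.
V1_JSON = """
{
  "type": "record",
  "name": "Rec",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "X", "type": ["null", "string"], "default": null}
  ]
}
"""

schema = json.loads(V1_JSON)
print(fields_without_default(schema))
```

Here X could be deleted in a later version, while deleting id would break any reader still on this schema.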

To delete a field that does not define a default value, the only
possibility would be to upgrade all clients to v2 before using the v2
schema for writing.  This is usually impractical in a large distributed
system.

My question is:  Does this make sense -- have I got it right?

Thanks in advance,
Doug Cutting 2012-10-01, 19:06
Mark Hayes 2012-10-01, 19:25