Avro >> mail # user >> Default field values required for field deletion


Default field values required for field deletion
Hi,

We're using Avro as the storage format for database records, and schema
evolution is a key feature for us.  I have a question regarding the
deletion of fields from a record, when a schema is changed.

Let's say field X is present in v1 of the schema without a default value,
and is deleted in v2.  The database can hold a mix of v1 and v2 records,
and there can be a mix of v1 and v2 client apps (apps that use v1 or v2 as
both their writer and reader schema).

If a v1 app reads a v2 record (written by a v2 app), an exception will be
thrown because the reader schema contains field X, the record being
deserialized does not contain field X, and the reader schema does not
contain a default value for field X.

Therefore, our conclusion is that a default value must be defined for each
field in a schema, in order to support deletion of that field from the
schema at a future time.
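
To make the reasoning above concrete, here is a minimal sketch in plain
Python.  It is a simplified model of the record-resolution rules as I
understand them, not the Avro library itself: the helper names (field,
resolve_record), the "id" field and the "unknown" default are all made up
for illustration, and a schema is reduced to a list of field descriptors.

```python
# Simplified model of record resolution between a writer and a reader schema.
# This is NOT the Avro library; every name below is hypothetical.

_NO_DEFAULT = object()  # sentinel meaning "this field declares no default"


def field(name, default=_NO_DEFAULT):
    """Build a field descriptor: a name plus an optional default value."""
    return {"name": name, "default": default}


def resolve_record(writer_fields, reader_fields, datum):
    """Project a record written with writer_fields onto reader_fields."""
    writer_names = {f["name"] for f in writer_fields}
    result = {}
    for f in reader_fields:
        name = f["name"]
        if name in writer_names:
            # Field exists on both sides: take the written value.
            result[name] = datum[name]
        elif f["default"] is not _NO_DEFAULT:
            # Reader-only field with a default: fill in the default.
            result[name] = f["default"]
        else:
            # Reader-only field without a default: resolution fails.
            raise ValueError(
                f"reader field {name!r} is missing from the writer schema "
                "and has no default"
            )
    # Writer-only fields are never copied, so the reader ignores them.
    return result


# v2 deleted field X; the two v1 variants differ only in X's default.
V2 = [field("id")]
V1_NO_DEFAULT = [field("id"), field("X")]
V1_WITH_DEFAULT = [field("id"), field("X", default="unknown")]

if __name__ == "__main__":
    v2_record = {"id": 7}  # written by a v2 app, so it has no X

    # v1 reader whose X has a default: the read succeeds.
    print(resolve_record(V2, V1_WITH_DEFAULT, v2_record))

    # v1 reader whose X has no default: the read fails.
    try:
        resolve_record(V2, V1_NO_DEFAULT, v2_record)
    except ValueError as err:
        print("read failed:", err)
```

Note that the opposite direction (a v2 app reading a v1 record) works in
this model whether or not X has a default, because a field that only the
writer knows about is simply skipped.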

To delete a field that does not define a default value, the only
possibility would be to upgrade all clients to v2 before using the v2
schema for writing.  This is usually impractical in a large distributed
system.

My question is:  Does this make sense -- have I got it right?

Thanks in advance,
--mark