Kafka >> mail # user >> Versioning Schema's

Re: Versioning Schema's
Actually, our schema ID is currently the MD5 hash of the schema itself. I'm
not fully sure how this compares with an explicit version field in the schema.
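
To make the MD5-as-ID idea concrete, here is a minimal sketch in Python. The normalization step (re-serializing the JSON with sorted keys and no extra whitespace) and the `Event` schemas are assumptions made for illustration; the thread does not say how the schema text is normalized before hashing, and this is not Avro's own canonical form.

```python
import hashlib
import json


def schema_id(schema_json: str) -> bytes:
    """Return a 16-byte MD5 digest of a normalized schema string."""
    # Assumption: normalize by parsing and re-serializing with sorted keys
    # and compact separators, so whitespace and key order do not change
    # the ID. The thread does not describe the actual normalization used.
    normalized = json.dumps(
        json.loads(schema_json), sort_keys=True, separators=(",", ":")
    )
    return hashlib.md5(normalized.encode("utf-8")).digest()


# Hypothetical schemas, used only for illustration.
V1 = '{"type": "record", "name": "Event", "fields": [{"name": "id", "type": "long"}]}'
V1_REFORMATTED = '{"name":"Event","type":"record","fields":[{"type":"long","name":"id"}]}'
V2 = (
    '{"type": "record", "name": "Event", "fields": ['
    '{"name": "id", "type": "long"}, '
    '{"name": "source", "type": ["null", "string"], "default": null}]}'
)

# Same schema content gives the same ID; a changed schema gives a new one.
same = schema_id(V1) == schema_id(V1_REFORMATTED)
different = schema_id(V1) != schema_id(V2)
```

One difference from an explicit version field is visible here: the digest says whether two schemas are the same, but it carries no ordering, so it cannot by itself say which of two schemas is newer.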


On Wed, Jun 12, 2013 at 8:29 AM, Jun Rao <[EMAIL PROTECTED]> wrote:

> At LinkedIn, we are using option 2.
> Thanks,
> Jun
> On Wed, Jun 12, 2013 at 7:14 AM, Shone Sadler <[EMAIL PROTECTED]> wrote:
>> Hello everyone,
>> After searching the mailing list for best practices on
>> integrating Avro with Kafka, there appear to be at least three options for
>> handling the Avro schema: 1) embedding the entire schema within the
>> message, 2) embedding a unique identifier for the schema in the message, and
>> 3) deriving the schema from the topic/resource name.
>> Option 2 appears to be the best option in terms of both efficiency and
>> flexibility. However, from a programming perspective it complicates the
>> solution, because it needs both an envelope schema (containing a "schema
>> id" field and a "bytes" field for record data) and a message schema (containing the
>> application-specific message fields). This requires two levels of
>> serialization/deserialization.
>> Questions:
>> 1) How are others dealing with versioning of schemas?
>> 2) Is there a more elegant means of embedding a schema ID in an Avro
>> message (I am new to both currently ;-)?
>> Thanks in advance!
>> Shone
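
On question 2, one possible alternative to a second Avro envelope schema is to prepend a fixed-length schema ID to the serialized record bytes and split it off again on the consumer side. The sketch below assumes a 16-byte ID (the length of an MD5 digest) and treats the payload as opaque bytes; the `frame` and `unframe` helpers and the layout are hypothetical, not something the thread confirms anyone uses.

```python
import hashlib
from typing import Tuple

ID_LEN = 16  # assumption: the schema ID is a raw 16-byte MD5 digest


def frame(schema_id: bytes, payload: bytes) -> bytes:
    """Prefix the serialized record bytes with the fixed-length schema ID."""
    if len(schema_id) != ID_LEN:
        raise ValueError("schema ID must be exactly %d bytes" % ID_LEN)
    return schema_id + payload


def unframe(message: bytes) -> Tuple[bytes, bytes]:
    """Split a framed message back into (schema ID, record bytes)."""
    if len(message) < ID_LEN:
        raise ValueError("message is shorter than the schema ID prefix")
    return message[:ID_LEN], message[ID_LEN:]


# Hypothetical usage: the payload stands in for an Avro binary-encoded record.
example_id = hashlib.md5(b"example schema text").digest()
example_payload = b"\x02\x06abc"
message = frame(example_id, example_payload)
recovered_id, recovered_payload = unframe(message)
```

With this layout only the record itself goes through Avro serialization; the consumer reads the prefix, looks up the matching schema by ID, and then decodes the remaining bytes with it.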