
Wai Yip Tung 2014-01-27, 19:01
Re: Schema exclusion from Avro message
If you're using Avro's RPC mechanism, schemas are only sent when the
client and server do not already have each other's schema.  Each
client request is preceded by a hash of the client's schema and the
schema it thinks the server is using.  If the server already has the
client's schema, and the client already has the server's, then the
server can directly respond.  If they do not have the other's schema
then schemas are transmitted and cached.  This way the server's schema
is only transmitted for the first request from a given client, and the
client's schema is only transmitted to the server the first time a
client with that schema connects.
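
To make the caching behaviour concrete, here is a rough, self-contained
simulation of the handshake described above. It does not use the Avro
library at all: the Server and Client classes, their method names, and the
schemas_sent counter are made up for illustration. The only parts taken
from the Avro specification are the MD5 hash of the protocol text and the
BOTH / CLIENT / NONE match values.

```python
import hashlib


def md5(protocol_json: str) -> bytes:
    # The handshake identifies a protocol by the MD5 hash of its JSON text.
    return hashlib.md5(protocol_json.encode("utf-8")).digest()


class Server:
    """Hypothetical server that caches client protocols by hash."""

    def __init__(self, protocol: str):
        self.protocol = protocol
        self.hash = md5(protocol)
        self.known_clients = {}  # client hash -> client protocol text

    def handshake(self, client_hash, server_hash, client_protocol=None):
        if client_protocol is not None:
            self.known_clients[client_hash] = client_protocol
        client_known = client_hash in self.known_clients
        server_known = server_hash == self.hash
        if client_known and server_known:
            reply = {"match": "BOTH"}
        elif client_known:
            reply = {"match": "CLIENT"}
        else:
            reply = {"match": "NONE"}
        if not server_known:
            # The client guessed the wrong server hash: send our protocol.
            reply["serverHash"] = self.hash
            reply["serverProtocol"] = self.protocol
        return reply


class Client:
    """Hypothetical client that caches the server protocol it learns."""

    def __init__(self, protocol: str):
        self.protocol = protocol
        self.hash = md5(protocol)
        self.server_hash = self.hash  # initial guess, before any contact
        self.server_protocols = {}    # server hash -> server protocol text
        self.schemas_sent = 0         # times the full protocol was sent

    def _learn(self, reply):
        if "serverHash" in reply:
            self.server_hash = reply["serverHash"]
            self.server_protocols[reply["serverHash"]] = reply["serverProtocol"]

    def connect(self, server):
        """Run the handshake and return the sequence of match values seen."""
        reply = server.handshake(self.hash, self.server_hash)
        matches = [reply["match"]]
        self._learn(reply)
        if reply["match"] == "NONE":
            # The server has never seen our protocol: retry with the full text.
            self.schemas_sent += 1
            reply = server.handshake(self.hash, self.server_hash, self.protocol)
            matches.append(reply["match"])
            self._learn(reply)
        return matches


server = Server('{"protocol": "ServerSide"}')
first = Client('{"protocol": "ClientSide"}')
print(first.connect(server))   # first contact: both protocols get exchanged
print(first.connect(server))   # later requests: hashes only
second = Client('{"protocol": "ClientSide"}')
print(second.connect(server))  # same client protocol, new client instance
```

In this sketch the second client never sends its protocol text, because the
server already cached it from the first client; it only has to learn the
server's protocol once.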

Avro Python does support RPC.

If you're not using Avro RPC but some other messaging mechanism, then
AVRO-1124 as you mention might be useful, but it also has not yet been

If you're storing Avro data in a file, then the schema is included in
the file, as you mention.
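
For reference, the file header layout in the spec is: the four magic bytes
"Obj" followed by 1, then a metadata map that carries avro.schema, then a
16-byte sync marker. The sketch below hand-rolls just enough of that
encoding, using only the Python standard library, to write a header and
read the schema back out. The function names are made up for this example;
real code would normally use the Avro library's file reader and writer
instead.

```python
import io
import json
import os

MAGIC = b"Obj\x01"


def write_long(out, n):
    # Avro longs are zigzag-encoded, then written as a variable-length int.
    n = (n << 1) ^ (n >> 63)
    while n & ~0x7F:
        out.write(bytes([(n & 0x7F) | 0x80]))
        n >>= 7
    out.write(bytes([n]))


def read_long(inp):
    shift = result = 0
    while True:
        b = inp.read(1)[0]
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)


def write_bytes(out, data):
    # Strings and bytes are a long length followed by the raw bytes.
    write_long(out, len(data))
    out.write(data)


def write_header(out, schema_json):
    out.write(MAGIC)
    meta = {"avro.schema": schema_json.encode("utf-8"), "avro.codec": b"null"}
    write_long(out, len(meta))           # one map block holding every entry
    for key, value in meta.items():
        write_bytes(out, key.encode("utf-8"))
        write_bytes(out, value)
    write_long(out, 0)                   # a zero count ends the map
    out.write(os.urandom(16))            # sync marker


def read_header(inp):
    if inp.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(inp)
        if count == 0:
            break
        if count < 0:
            # A negative count is followed by the block size in bytes.
            count = -count
            read_long(inp)
        for _ in range(count):
            key = inp.read(read_long(inp)).decode("utf-8")
            meta[key] = inp.read(read_long(inp))
    sync = inp.read(16)
    return meta, sync


schema = '{"type": "record", "name": "Event", "fields": [{"name": "id", "type": "long"}]}'
buf = io.BytesIO()
write_header(buf, schema)
buf.seek(0)
meta, sync = read_header(buf)
print(json.loads(meta["avro.schema"])["name"])
```

Because the schema travels in the header once per file rather than once per
record, the per-record overhead is nil; the cost only matters when each
"file" is a single small message.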


On Mon, Jan 27, 2014 at 11:00 AM, Wai Yip Tung <[EMAIL PROTECTED]> wrote:
> I found Deepesh's question from back in December. I joined the mailing list
> later, so I don't have the message in my inbox and do not know the proper
> way to reply. Anyway, I have included the original message below.
> I have a similar issue. In addition, I'm interested in finding out about
> Python and Node.js library support.
> From what I understand, the Avro specification requires avro.schema, so I am
> quite unsure of the status of having the schema in an external repository.
> -  avro.schema contains the schema of objects stored in the file, as JSON
> data (required).
> http://avro.apache.org/docs/1.7.6/spec.html#Object+Container+Files
> Wai Yip
>> From    Deepesh Malviya <[EMAIL PROTECTED]>
>> Subject    Schema exclusion from Avro message
>> Date    Sun, 15 Dec 2013 12:58:18 GMT
>> Hi,
>> I have read in multiple places that we can exclude the schema from being
>> packed into the Avro message and include only a version to allow schema
>> lookup. I have also looked into
>> https://issues.apache.org/jira/browse/AVRO-1124 but couldn't find
>> how to make use of such a repository while reading or
>> writing Avro messages.
>> I just need some pointers to get started. My use case is
>> sending Avro messages from a C-based Avro client to Flume/Kafka and finally
>> storing them in Hadoop.
>> --
>> _Deepesh

Wai Yip Tung 2014-01-28, 01:09