Avro, mail # dev - clarifications on file format

Scott Banachowski 2010-04-01, 00:23
Scott Carey 2010-04-01, 02:56
Re: clarifications on file format
Jeff Hammerbacher 2010-04-01, 17:05
> The map of metadata key/value pairs begins with a long, then a number of
> string-key/bytes-value pairs.  To be consistent with avro maps, should this
> be followed by a long of 0?  The spec doesn't say explicitly, but if the
> header is described by an avro schema I would suspect yes.

Not sure if this is what you are talking about, but in the Python
implementation (datafile.py) we define an Avro schema for the header:


META_SCHEMA = schema.parse("""\
{"type": "record", "name":
 "fields" :
   {"name": "magic", "type": {"type": "fixed", "name": "magic", "size":
   {"name": "meta", "type": {"type": "map", "values":
   {"name": "sync", "type": {"type": "fixed", "name": "sync", "size":
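To make the quoted question concrete: if the header really is an Avro record whose "meta" field is an ordinary Avro map, then the map is written as one or more blocks (a long count followed by that many key/value pairs) and is closed by a block count of 0. The following is only a rough sketch of that encoding, not the datafile.py code. The function names are made up for illustration, and the magic bytes ("Obj" followed by 1) and the 16-byte sync marker are assumptions taken from my reading of the spec.

```python
# Hypothetical helpers, written for illustration only; not taken from datafile.py.

MAGIC = b"Obj\x01"   # assumption: magic as described in the Avro spec
SYNC_SIZE = 16       # assumption: sync marker length from the Avro spec


def encode_long(n):
    """Encode an Avro long: zigzag, then variable-length base-128."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(b):
    """Avro bytes: a long length followed by the raw bytes."""
    return encode_long(len(b)) + b


def encode_string(s):
    """Avro string: UTF-8 bytes with a long length prefix."""
    return encode_bytes(s.encode("utf-8"))


def encode_map(meta):
    """Avro map: one block of pairs (if any), then a terminating 0 count."""
    out = bytearray()
    if meta:
        out += encode_long(len(meta))
        for key, value in meta.items():
            out += encode_string(key)
            out += encode_bytes(value)
    out += encode_long(0)  # end-of-map marker
    return bytes(out)


def encode_header(meta, sync):
    """Header sketch: magic, then the metadata map, then the sync marker."""
    if len(sync) != SYNC_SIZE:
        raise ValueError("sync marker must be %d bytes" % SYNC_SIZE)
    return MAGIC + encode_map(meta) + sync
```

With this encoding, the byte immediately before the sync marker is always the 0 that closes the map, so a reader that decodes the header through a generic Avro map decoder will stop in the right place.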



Also, some written container files should show up in
https://issues.apache.org/jira/browse/AVRO-230 real soon now.