Avro >> mail # dev >> clarifications on file format


Re: clarifications on file format
> The map of metadata key/value pairs begins with a long, then a number of
> string-key/bytes-value pairs.  To be consistent with avro maps, should this
> be followed by a long of 0?  The spec doesn't say explicitly, but if the
> header is described by an avro schema I would suspect yes.
>

Not sure if this is what you are talking about, but in the Python
implementation (datafile.py) we define an Avro schema for the header:

"""

META_SCHEMA = schema.parse("""\
{"type": "record", "name": "org.apache.avro.file.Header",
 "fields" : [
   {"name": "magic", "type": {"type": "fixed", "name": "magic", "size": %d}},
   {"name": "meta", "type": {"type": "map", "values": "bytes"}},
   {"name": "sync", "type": {"type": "fixed", "name": "sync", "size": %d}}]}
""" % (MAGIC_SIZE, SYNC_SIZE))

"""
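
On the trailing-zero question: since "meta" is declared as an ordinary Avro map in that schema, it would be written with the usual Avro map encoding, which is a series of blocks (a long count followed by that many key/value pairs) terminated by a block count of zero. Below is a rough sketch of that encoding in plain Python. It is not the code from datafile.py, and the helper names (encode_long, encode_bytes, encode_meta_map) are made up for illustration.

```python
def encode_long(n):
    """Zigzag plus variable-length encoding, as Avro uses for int/long."""
    n = (n << 1) ^ (n >> 63)          # zigzag: small magnitudes -> small codes
    out = bytearray()
    while n & ~0x7F:                   # more than 7 bits left
        out.append((n & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        n >>= 7
    out.append(n)
    return bytes(out)


def encode_bytes(b):
    """Avro bytes/string: a long length followed by the raw bytes."""
    return encode_long(len(b)) + b


def encode_meta_map(meta):
    """Hypothetical helper: encode a {str: bytes} dict as an Avro map."""
    out = bytearray()
    if meta:
        out += encode_long(len(meta))  # one block holding every entry
        for key, value in meta.items():
            out += encode_bytes(key.encode("utf-8"))
            out += encode_bytes(value)
    out += encode_long(0)              # zero count terminates the map
    return bytes(out)


encoded = encode_meta_map({"avro.codec": b"null"})
print(encoded)
```

For a one-entry map the output ends with a single 0x00 byte, which is the long of 0 asked about above. An empty map is just that one zero byte.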

Also, some written container files should show up in
https://issues.apache.org/jira/browse/AVRO-230 real soon now.

Thanks,
Jeff