Re: Embedding schema with binary encoding
An Avro data file always stores the schema in JSON form in the header.
There may be an old JIRA ticket considering the possibility of writing the
schema in a more compact form. The data in the file is always encoded in
Avro binary form, optionally compressed with snappy or deflate (the
algorithm gzip uses), and with a variable block size.
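
For reference, here is a minimal sketch of writing such a file with
DataFileWriter. The record name Message, the single string field, the class
and method names, and the deflate level are assumptions for illustration;
only the "to" field and the "Alyssa" value come from the code quoted below.
The records are still written in Avro binary; the difference from a bare
binaryEncoder is that create() first writes a header containing the schema.

```java
import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class EmbeddedSchemaExample {

    // Hypothetical schema: one string field named "to", as in the question.
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Message\",\"fields\":"
        + "[{\"name\":\"to\",\"type\":\"string\"}]}";

    // Writes one record to an Avro data file; the schema goes into the header.
    static void write(File file, Schema schema) throws IOException {
        GenericDatumWriter<GenericRecord> datumWriter =
            new GenericDatumWriter<GenericRecord>(schema);
        try (DataFileWriter<GenericRecord> fileWriter =
                 new DataFileWriter<GenericRecord>(datumWriter)) {
            // Optional compression; must be set before create(). Level 6 is an arbitrary choice.
            fileWriter.setCodec(CodecFactory.deflateCodec(6));
            // create() writes the file header, including the schema as JSON.
            fileWriter.create(schema, file);

            GenericRecord message1 = new GenericData.Record(schema);
            message1.put("to", "Alyssa");
            // append() encodes the record in Avro binary form.
            fileWriter.append(message1);
        }
    }

    // Returns the schema recovered from the file header; no schema is passed in.
    static Schema readSchema(File file) throws IOException {
        try (DataFileReader<GenericRecord> reader = new DataFileReader<GenericRecord>(
                 file, new GenericDatumReader<GenericRecord>())) {
            return reader.getSchema();
        }
    }

    // Returns the "to" field of the first record in the file.
    static String readFirstTo(File file) throws IOException {
        try (DataFileReader<GenericRecord> reader = new DataFileReader<GenericRecord>(
                 file, new GenericDatumReader<GenericRecord>())) {
            // Strings are decoded as Utf8 by default, hence toString().
            return reader.next().get("to").toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
        File file = new File("messages.avro");
        write(file, schema);
        System.out.println(readSchema(file).toString(true));
        System.out.println(readFirstTo(file));
    }
}
```

Reading the file back with DataFileReader needs no schema from the caller,
which is one way to confirm that the schema really was embedded.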

On 1/8/13 1:49 AM, "Pratyush Chandra" <[EMAIL PROTECTED]> wrote:

> Hi Scott,
> I am able to find an example of JSON encoding with DataFileWriter, which embeds
> the schema, but I am unable to find a DataFileWriter example for binary encoding
> with the schema.
> Thanks
> Pratyush
> On Tue, Jan 8, 2013 at 2:56 PM, Scott Carey <[EMAIL PROTECTED]> wrote:
>> Calling toString() on a Schema will return it in JSON form. However, you most
>> likely do not want to invent your own file format for Avro data. Use
>> DataFileWriter, which will manage the schema for you, along with compression,
>> metadata, and the ability to seek to the middle of the file. Additionally,
>> the file is then readable from several other languages and tools.
>> On 1/7/13 4:42 AM, "Pratyush Chandra" <[EMAIL PROTECTED]> wrote:
>>> I am able to serialize with binary encoding to a file using the following:
>>>         FileOutputStream outputStream = new FileOutputStream(file);
>>>         Encoder e = EncoderFactory.get().binaryEncoder(outputStream, null);
>>>         DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<GenericRecord>(schema);
>>>         GenericRecord message1 = new GenericData.Record(schema);
>>>         message1.put("to", "Alyssa");
>>>         datumWriter.write(message1, e);
>>>         e.flush();
>>>         outputStream.close();
>>> But the output file contains only the serialized data and not the schema. How
>>> can I add the schema as well?
>>> Thanks
>>> Pratyush Chandra
> --
> Pratyush Chandra