c++ DataFileWriter not doing validation
SCHENK, Jarrad 2013-05-21, 03:22
Hi List,

I'm working with the C++ bindings to write data to Avro files.

Much of the documentation assumes that the types to be written (and the code to write the data) are generated using avrogencpp.

In my case, I have an existing set of type/struct hierarchies that I want to write, so I don't want to use the output of avrogencpp directly. Instead, I am producing code that is very similar to what avrogencpp outputs but adapted to suit my types.

What I'm finding is that the C++ DataFileWriter does no validation between the schema I provide and the datums that get written. As a result, any discrepancy between the schema and the written datums corrupts the file and makes it essentially unreadable.

I see that there is a ValidatingEncoder class that can be used when serialising to a memory stream (as per the Getting Started docs), but there doesn't appear to be any way to use this encoder with the DataFileWriter.

Am I missing something? Is there a preferred way to make the writer do validation?

