Avro >> mail # dev >> C library

Douglas Creager 2010-10-14, 01:45
Bruce Mitchener 2010-10-14, 02:01
Douglas Creager 2010-10-14, 02:43
Bruce Mitchener 2010-10-14, 02:56
Douglas Creager 2010-10-14, 03:49
Bruce Mitchener 2010-10-14, 03:58
Matt Massie 2010-10-14, 15:59
> Please continue this discussion on the list since that's what it's
> for.

Will do.

> I think it would be great if we could add support for generated code
> to avro-c. I've been itching lately to do some C programming.
> Cloudera is having a Hackathon in about a week so maybe I could
> dedicate some cycles then to help.

Generated code certainly sounds useful, but I don't know if it will help
my particular problem.  In my case, I'm adding Avro support to an
existing application, which already has quite a few custom C structs
that it's aggregating data into.  With the current implementation, I
have to copy this data into a tree of avro_datum_t instances before
writing the data out to an Avro file.  Codegen would probably make that
a bit easier, but there would still be a set of (now automatically
generated) Avro-specific structs that I'd have to copy into.  What I'm
looking for / working on is a different approach, where I provide a set
of callbacks that tell the Avro file writer how to extract the correct
values directly out of my pre-existing, non-Avro-specific struct.  My
hope is that this will be (a) just as easy to code, and (b) faster,
especially when multiplied by tens of millions of rows.
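
To make the callback idea concrete, here is a rough sketch of what I have
in mind.  None of this is the existing avro-c API: app_row_t,
out_buf_t, field_extractor_t, write_record and the accessor functions
are all hypothetical names made up for illustration, and the sketch
only handles "long" and "string" fields.  The one part taken from Avro
itself is the binary encoding (zigzag varint for a long, a long length
prefix followed by the bytes for a string).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pre-existing application struct; nothing in it is Avro-specific. */
typedef struct {
    int64_t     id;
    const char *name;
} app_row_t;

/* Fixed-capacity output buffer (hypothetical; stands in for a file writer). */
typedef struct {
    uint8_t *data;
    size_t   len;
    size_t   cap;
} out_buf_t;

typedef enum { FIELD_LONG, FIELD_STRING } field_type_t;

/* Hypothetical callback table entry: one per schema field, in schema order. */
typedef struct {
    field_type_t type;
    int64_t     (*get_long)(const void *row);    /* used when type == FIELD_LONG */
    const char *(*get_string)(const void *row);  /* used when type == FIELD_STRING */
} field_extractor_t;

int put_byte(out_buf_t *out, uint8_t b)
{
    if (out->len >= out->cap) return -1;
    out->data[out->len++] = b;
    return 0;
}

/* Avro long: zigzag-map the value, then emit 7 bits per byte, low bits first. */
int encode_long(out_buf_t *out, int64_t v)
{
    uint64_t n = ((uint64_t)v << 1) ^ (v < 0 ? ~(uint64_t)0 : 0);
    while (n & ~(uint64_t)0x7F) {
        if (put_byte(out, (uint8_t)((n & 0x7F) | 0x80))) return -1;
        n >>= 7;
    }
    return put_byte(out, (uint8_t)n);
}

/* Avro string: byte length encoded as a long, followed by the raw bytes. */
int encode_string(out_buf_t *out, const char *s)
{
    size_t n = strlen(s);
    if (encode_long(out, (int64_t)n)) return -1;
    if (out->cap - out->len < n) return -1;
    memcpy(out->data + out->len, s, n);
    out->len += n;
    return 0;
}

/* Encode one record by pulling each field straight out of the caller's
 * struct through its callback; no intermediate datum tree is built. */
int write_record(out_buf_t *out, const field_extractor_t *fields,
                 size_t nfields, const void *row)
{
    assert(out != NULL && fields != NULL && row != NULL);
    for (size_t i = 0; i < nfields; i++) {
        int rc = (fields[i].type == FIELD_LONG)
            ? encode_long(out, fields[i].get_long(row))
            : encode_string(out, fields[i].get_string(row));
        if (rc) return rc;
    }
    return 0;
}

/* Application side: accessors for app_row_t plus the table describing it. */
int64_t row_id(const void *row) { return ((const app_row_t *)row)->id; }
const char *row_name(const void *row) { return ((const app_row_t *)row)->name; }

const field_extractor_t app_row_fields[] = {
    { FIELD_LONG,   row_id, NULL     },
    { FIELD_STRING, NULL,   row_name },
};
```

The application would call write_record(&out, app_row_fields, 2, &row)
once per row.  The only per-row cost beyond the encoding itself is one
indirect call per field, which is where I'd expect the savings over
copying into avro_datum_t instances to come from, though I haven't
measured that yet.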