Avro, mail # dev - C library


Re: C library
Matt Massie 2010-10-14, 15:59
Please continue this discussion on the list, since that's what it's for.  I
think it would be great if we could add support for generated code to avro-c.
I've been itching lately to do some C programming.  Cloudera is having a
Hackathon in about a week, so maybe I could dedicate some cycles then to
help.

--
Matt

On Wed, Oct 13, 2010 at 8:58 PM, Bruce Mitchener <[EMAIL PROTECTED]> wrote:

> On Thu, Oct 14, 2010 at 10:49 AM, Douglas Creager <[EMAIL PROTECTED]> wrote:
>
> > > Not to me. :)  I'm assuming that you mean something that uses GValue
> > > and so on?
> >
> > Ah, whoops.  No, I'm not suggesting GValue.  *shudder*
> >
>
> *whew*
>
>
> > I was thinking more like using:
> >
> >  • GObject for the schema/datum subclassing
> >  • GHashTable or GTree to store a record schema's fields, etc.
> >  • GIO for the generic I/O interfaces
> >  • GQuark instead of the atom implementation that was checked in and
> >   then reverted
>
>
> Okay, I see ... but that can't happen within the Apache implementation due
> to licensing issues.  (It also doesn't work for my usages because it isn't
> clear that LGPL code can be shipped at all legally on some of my target
> platforms.)
>
>
> > > I don't want the overhead of that sort of thing at all in my C code.
> > > I'm supporting resource constrained platforms, so I just want to go from
> > > my C struct straight to a buffer without building an intermediate data
> > > structure.
> >
> > We're in violent agreement.  One thing I've started experimenting with
> > is a “streaming” API, so that instead of creating a tree of avro_datum_t
> > instances, the file reader calls a series of callback functions as each
> > bit of data is encountered.  We're generating Avro files from an
> > existing C network sensor application, and it's a bit of overhead (in
> > both code and speed) to have to move between our actual data types and
> > the avro_datum_t instances.
> >
>
> Okay, then we're talking about similar things.  But you can also just
> generate code and then you don't need schemas or anything else at runtime,
> no?
>
> What I'm doing is just a low level API that I can use from generated code.
> I don't need (or want) schemas or anything else in the way.
>
> Maybe we should talk more off-list.
>
>  - Bruce
>
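
For illustration, here is a rough sketch of the "C struct straight to a
buffer" idea from the quoted thread, roughly what generated code might look
like for a record with one long field and one string field.  It is not
avro-c API: the struct and every function name (sensor_reading_t,
encode_long, encode_string, sensor_reading_encode) are hypothetical.  The
only thing taken from Avro itself is the binary encoding, where a long is
written as a zig-zag variable-length integer and a string as a long byte
count followed by the bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record type; a code generator would derive it from a schema
 * with a "long" field named timestamp and a "string" field named host. */
typedef struct {
    int64_t     timestamp;
    const char *host;
} sensor_reading_t;

/* Write an Avro long: zig-zag the value, then emit it 7 bits at a time,
 * least significant group first, with the high bit set on every byte
 * except the last.  Returns bytes written, or 0 if the buffer is too small. */
static size_t encode_long(uint8_t *buf, size_t cap, int64_t value)
{
    uint64_t n = ((uint64_t)value << 1) ^ (value < 0 ? UINT64_MAX : 0);
    size_t i = 0;

    while (n > 0x7F) {
        if (i >= cap)
            return 0;
        buf[i++] = (uint8_t)((n & 0x7F) | 0x80);
        n >>= 7;
    }
    if (i >= cap)
        return 0;
    buf[i++] = (uint8_t)n;
    return i;
}

/* Write an Avro string: byte length as a long, then the raw bytes.
 * Returns bytes written, or 0 if the buffer is too small. */
static size_t encode_string(uint8_t *buf, size_t cap, const char *s)
{
    size_t len = strlen(s);
    size_t n = encode_long(buf, cap, (int64_t)len);

    if (n == 0 || cap - n < len)
        return 0;
    memcpy(buf + n, s, len);
    return n + len;
}

/* Encode the record's fields in schema order, with no schema object and no
 * intermediate datum tree at runtime.  Returns bytes written, or 0 on
 * overflow. */
size_t sensor_reading_encode(const sensor_reading_t *r,
                             uint8_t *buf, size_t cap)
{
    size_t a, b;

    a = encode_long(buf, cap, r->timestamp);
    if (a == 0)
        return 0;
    b = encode_string(buf + a, cap - a, r->host);
    if (b == 0)
        return 0;
    return a + b;
}
```

A caller would fill in a sensor_reading_t, pass a stack or static buffer to
sensor_reading_encode(), and write out the returned number of bytes.  The
field order is baked into the function, which is why nothing needs to be
looked up at runtime.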