RE: Avro C API - Handling of Default values
Hi Doug

Thanks for the tips - I don't quite understand your intention with
adding the default value to the schema.  The schema type (avro_schema_t)
is just a typedef for an avro_obj_t.  Did you mean that I should:

- Add a default value (avro_value_t) to avro_obj_t - which doesn't sound
right - since avro_obj_t looks like it should be a primitive

Or

- Redefine avro_schema_t as a struct with two fields, the original
avro_obj_t, and a default value as the second field?

Or

- Add default values to each of the individual schema types (e.g.
avro_record_field_t, avro_enum_schema_t, avro_fixed_schema_t, etc.)?

Regards,
Steve Roehrs

-----Original Message-----
From: Douglas Creager [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 23, 2013 12:41 AM
To: Steve Roehrs; [EMAIL PROTECTED]
Subject: Re: Avro C API - Handling of Default values

> Sorry to contact you off-list but I wasn't sure if you saw my question
> in AVRO-DEV amongst all the JIRA messages.

Hi Steve, you're right, that did slip through the cracks.  Sorry for
that!  CCing the dev list so that we're back on the record.

> You mentioned a few weeks ago that the AVRO C API doesn't handle default
> values, and for our particular application we need that functionality.
> I'm happy to do the implementation myself and submit a patch, but I need
> a few pointers on how to get there.  You mentioned that we have all of
> the pieces in place to handle default values - I've been looking over
> the code and I can see the TODO placeholders in the resolved
> readers/writers etc., but I don't know how to store the generic 'default'
> value itself in the schema.  The Java API uses JsonNodes and converts
> them at resolution time, but I can't see how to do the same in C.
>
> Any help or guidance would be greatly appreciated.

I think I was a bit optimistic when I said all the pieces were there -
from your email to the dev list, it looks like you've found the piece
that's missing: filling in an avro_value_t from a json_t.  I had thought
we had already written that, but it looks like we haven't.  It won't be
too hard to write, though - you can pretty much copy/paste the
avro_value_to_json code, and change the avro_value_get_[scalar] calls
to avro_value_set_[scalar] calls.
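
As a rough illustration of that get-to-set mirroring, here is a minimal
sketch.  It deliberately does not use the Avro C or Jansson APIs: the
toy_json_t and toy_value_t types and every toy_* function are
hypothetical stand-ins invented for this example, so treat it as the
shape of the idea rather than the real implementation.

```c
/* Hypothetical stand-ins, not Avro C types: toy_json_t plays the role of a
 * parsed JSON node and toy_value_t the role of avro_value_t. */
typedef enum { TOY_BOOLEAN, TOY_LONG, TOY_DOUBLE } toy_type_t;

typedef struct {
    toy_type_t type;
    union { int b; long l; double d; } as;
} toy_json_t;

typedef struct {
    toy_type_t type;   /* fixed by the schema the value was built from */
    union { int b; long l; double d; } as;
} toy_value_t;

/* Setters: the write-side counterparts of the getters a serializer uses. */
void toy_value_set_boolean(toy_value_t *v, int b)   { v->as.b = b; }
void toy_value_set_long(toy_value_t *v, long l)     { v->as.l = l; }
void toy_value_set_double(toy_value_t *v, double d) { v->as.d = d; }

/* Fill dest from json.  Returns 0 on success, or -1 if the JSON node does
 * not match the type that dest's schema expects. */
int toy_value_from_json(toy_value_t *dest, const toy_json_t *json)
{
    switch (dest->type) {
    case TOY_BOOLEAN:
        if (json->type != TOY_BOOLEAN) return -1;
        toy_value_set_boolean(dest, json->as.b);
        return 0;
    case TOY_LONG:
        if (json->type != TOY_LONG) return -1;
        toy_value_set_long(dest, json->as.l);
        return 0;
    case TOY_DOUBLE:
        /* A JSON integer is accepted where a double is expected. */
        if (json->type == TOY_LONG) {
            toy_value_set_double(dest, (double) json->as.l);
            return 0;
        }
        if (json->type != TOY_DOUBLE) return -1;
        toy_value_set_double(dest, json->as.d);
        return 0;
    }
    return -1;
}
```

The sketch dispatches on the destination's type because the schema, not
the JSON text, decides what a default value is allowed to be.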

With that function available, you'd need to update the avro_schema code
to include an optional default value.  For that, I'd just put an
`avro_value_t` (not an `avro_value_t *`) into the schema type, and
define an accessor and mutator function:

  avro_value_t *
  avro_schema_get_default_value(avro_schema_t schema)
  {
      return &schema->default_value;
  }

  int
  avro_schema_set_default_value(avro_schema_t schema,
                                avro_value_t *value)
  {
      /* Take over the caller's reference to the value. */
      avro_value_move_ref(&schema->default_value, value);
      return 0;
  }

(You can't just store the value pointer in the schema, since
avro_value_t instances are often allocated on the stack.)
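
To make the stack-allocation point concrete, here is a small sketch
using hypothetical toy_* types rather than the real Avro C ones.  It
assumes a value is a small handle onto reference-counted storage, and
that "move" hands the caller's reference to the destination; both are
simplifications made for this example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for avro_value_t: a small handle that points at
 * reference-counted heap storage. */
typedef struct { int refcount; long payload; } toy_storage_t;
typedef struct { toy_storage_t *self; } toy_value_t;

typedef struct {
    toy_value_t default_value;   /* embedded by value, not a pointer */
} toy_schema_t;

int toy_value_new(toy_value_t *v, long payload)
{
    v->self = malloc(sizeof *v->self);
    if (v->self == NULL) return -1;
    v->self->refcount = 1;
    v->self->payload = payload;
    return 0;
}

/* Drop this handle's reference and free the storage when it was the last. */
void toy_value_decref(toy_value_t *v)
{
    if (v->self != NULL && --v->self->refcount == 0) free(v->self);
    v->self = NULL;
}

/* Toy move: dst releases what it held, takes over src's reference, and
 * src is left empty. */
void toy_value_move_ref(toy_value_t *dst, toy_value_t *src)
{
    toy_value_decref(dst);
    *dst = *src;
    src->self = NULL;
}

int toy_schema_set_default(toy_schema_t *schema, long payload)
{
    toy_value_t tmp;             /* the handle lives on this stack frame */
    if (toy_value_new(&tmp, payload) != 0) return -1;
    toy_value_move_ref(&schema->default_value, &tmp);
    return 0;                    /* tmp disappears; the schema's handle stays valid */
}
```

Had the schema stored `&tmp` instead, the pointer would dangle as soon as
toy_schema_set_default returned.  The schema's default_value must start
out empty (self set to NULL) before the first call.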

Then you'd have to update the avro_schema_from_json function to check
for a default value in the JSON schema text.  If one is there, you'd
need to allocate a new value instance (avro_generic_class_from_schema +
avro_generic_value_new), fill in that value from the JSON content (new
avro_value_from_json function), and then assign it into the schema that
you just created.

Then on the resolution side, in those places where there are TODO
messages about default values, you'd check the reader schema to see if a
default value is available, and if so, use avro_value_copy or
avro_value_copy_ref to return the default value when the caller asks for
a field that isn't present in the writer schema.
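
The fallback on the resolution side could look roughly like the sketch
below.  It does not touch the real resolved reader/writer code: the
toy_* types and toy_resolve_field are hypothetical, and a plain long
stands in for a full value so the control flow stays visible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reader-side field: a name plus an optional default. */
typedef struct {
    const char *name;
    int has_default;
    long default_value;
} toy_reader_field_t;

/* Hypothetical writer-side field as it appears in the written data. */
typedef struct {
    const char *name;
    long value;
} toy_writer_field_t;

/* Resolve one reader field against the writer's fields.
 * Returns 0 and stores the result in *out on success, or -1 if the field
 * is missing from the writer and the reader schema has no default. */
int toy_resolve_field(const toy_reader_field_t *reader_field,
                      const toy_writer_field_t *writer_fields,
                      size_t writer_count,
                      long *out)
{
    size_t i;
    for (i = 0; i < writer_count; i++) {
        if (strcmp(writer_fields[i].name, reader_field->name) == 0) {
            *out = writer_fields[i].value;   /* the writer supplied it */
            return 0;
        }
    }
    if (reader_field->has_default) {
        *out = reader_field->default_value;  /* copy of the schema default */
        return 0;
    }
    return -1;                               /* unresolvable field */
}
```

Copying the default out, rather than handing back the schema's own
instance, keeps the caller from modifying the value stored in the schema.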

> I have modified avro_pipe.c to optionally take an external JSON schema
> file and use a resolved reader/writer as per the example you sent to
> Chris Laws, I will submit that too if you think it's worth it.

Definitely!  That sounds like a great addition.