Avro >> mail # user >> Schema evolution and projection


Chris Laws 2013-02-28, 13:21
Re: Schema evolution and projection
I'm not familiar with the C implementation, but it should follow the
resolution rules from the specification:

http://avro.apache.org/docs/current/spec.html#Schema+Resolution

We call it "projection" when schema resolution is used with a subset
schema as the reader's schema.  A subset is created by removing fields
from the writer's schema that are not required.
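As a rough illustration of building such a subset schema, here is a minimal Python sketch that works on the schema JSON only and does not use any Avro library. The record name "Example" and the helper make_projection_schema are hypothetical; the field names and int types are taken from the avrocat output quoted later in this thread.

```python
import copy
import json

# Writer's schema. The record name "Example" is a placeholder (assumption);
# the two int fields follow the avrocat output quoted in this thread.
WRITER_SCHEMA = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "Field_1", "type": "int"},
        {"name": "Field_2", "type": "int"},
    ],
}


def make_projection_schema(writer_schema, wanted):
    """Return a reader's schema that keeps only the writer fields named in `wanted`."""
    known = {field["name"] for field in writer_schema["fields"]}
    missing = set(wanted) - known
    if missing:
        raise KeyError(f"not in writer's schema: {sorted(missing)}")
    # Copy first so the writer's schema itself is left untouched.
    reader_schema = copy.deepcopy(writer_schema)
    reader_schema["fields"] = [
        field for field in reader_schema["fields"] if field["name"] in wanted
    ]
    return reader_schema


READER_SCHEMA = make_projection_schema(WRITER_SCHEMA, {"Field_2"})
print(json.dumps(READER_SCHEMA, indent=2))
```

The resulting JSON could then be handed to an Avro implementation as the reader's schema; how that is done in avro-c is not covered by this sketch.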

Does that help?

Doug

On Thu, Feb 28, 2013 at 5:21 AM, Chris Laws <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am struggling to familiarise myself with schema evolution and schema
> projection using the avro-c implementation.
>
> There doesn't seem to be much information available on how to perform these
> tasks. The examples on the C API page confusingly mix the old datum API with
> the new value API.
>
> I have built what I think is a really simple example of testing schema
> projection but it does not work the way I think it should work - more than
> likely my understanding is wrong.
>
> When I ask for one particular field of a record to be retrieved (by
> specifying the field name), I instead get every field that matches the
> requested type.
>
> The attached file projection_01.c (attached and at
> https://gist.github.com/claws/5056626) defines a really simple record with
> If I avrocat the container file I see:
> {"Field_1": 1, "Field_2": 1}
> {"Field_1": 2, "Field_2": 2}
> {"Field_1": 3, "Field_2": 3}
> {"Field_1": 4, "Field_2": 4}
> {"Field_1": 5, "Field_2": 5}
>
> The projection schema being used is a record only containing Field_2 of type
> int. I only expected that field to be returned by the reader yet I receive
> every int type field, confusingly labelled as "Field_2".
>
> When I run projection_01.c I see:
> {"Field_2": 1}
> {"Field_2": 1}
> {"Field_2": 2}
> {"Field_2": 2}
> {"Field_2": 3}
> {"Field_2": 3}
> {"Field_2": 4}
> {"Field_2": 4}
> {"Field_2": 5}
> {"Field_2": 5}
>
> Is this how schema projection is supposed to work? Does it just return items
> of the same type irrespective of the field name specified?
>
> I think I am missing something about how this is supposed to work. Perhaps
> my example record is too simple.
>
> So, I then created a slightly more complex schema that contained a
> sub-record and the projection seems to work how I think it should work. This
> can be seen in the output from projection_02.c (attached and at
> https://gist.github.com/claws/5056643) which returns:
> {"Field_2": {"SubField_1": 1, "SubField_2": 42}}
> {"Field_2": {"SubField_1": 24, "SubField_2": 3}}
> {"Field_2": {"SubField_1": 2, "SubField_2": 42}}
> {"Field_2": {"SubField_1": 24, "SubField_2": 3}}
> {"Field_2": {"SubField_1": 3, "SubField_2": 42}}
> {"Field_2": {"SubField_1": 24, "SubField_2": 3}}
> {"Field_2": {"SubField_1": 4, "SubField_2": 42}}
> {"Field_2": {"SubField_1": 24, "SubField_2": 3}}
> {"Field_2": {"SubField_1": 5, "SubField_2": 42}}
> {"Field_2": {"SubField_1": 24, "SubField_2": 3}}
>
> From this trial and error it appears that the projection will return me
> values that match the projection schema's types - but does not take into
> account any 'name' fields. Would that be an accurate assessment?
>
> Can anyone provide some more information on schema projection?
> Is there a good example anywhere?
>
> Regards,
> Chris
>
>
>
>
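For comparison with the outputs quoted above, the following Python sketch applies the record rules from the Schema Resolution section of the specification to the five records in the question: fields are matched by name, writer fields absent from the reader's schema are ignored, and a reader field missing from the writer needs a default. It is a simplified model that assumes flat records and skips type promotion, not a description of what avro-c does internally. The record name "Example" and the function project_record are hypothetical.

```python
import json

# Reader's (projection) schema: only Field_2. The record name is a placeholder.
READER_SCHEMA = {
    "type": "record",
    "name": "Example",
    "fields": [{"name": "Field_2", "type": "int"}],
}

# The five records shown by avrocat in the question.
WRITTEN = [{"Field_1": i, "Field_2": i} for i in range(1, 6)]


def project_record(record, reader_schema):
    """Resolve one written record against the reader's schema, matching by name."""
    projected = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            # Same name in writer and reader: keep the writer's value.
            projected[name] = record[name]
        elif "default" in field:
            # Reader-only field with a default: use the default.
            projected[name] = field["default"]
        else:
            # Reader-only field without a default: resolution fails.
            raise ValueError(f"no value or default for field {name!r}")
    # Writer fields not named by the reader (Field_1 here) are simply dropped.
    return projected


PROJECTED = [project_record(record, READER_SCHEMA) for record in WRITTEN]
for record in PROJECTED:
    print(json.dumps(record))
```

Under these rules each written record yields exactly one projected record, so five lines are printed ({"Field_2": 1} through {"Field_2": 5}) rather than the ten reported for projection_01.c.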
Douglas Creager 2013-02-28, 21:01
Chris Laws 2013-02-28, 21:13
Douglas Creager 2013-02-28, 22:38
Chris Laws 2013-03-01, 01:50
Martin Kleppmann 2013-03-01, 13:53
Chris Laws 2013-03-01, 22:26
Douglas Creager 2013-03-01, 23:34
Chris Laws 2013-03-02, 02:05
Doug Cutting 2013-03-01, 22:47
Doug Cutting 2013-03-01, 01:30
Chris Laws 2013-03-01, 01:41
Chris Laws 2013-03-01, 01:14