Avro >> mail # user >> robustness of Schema.Field.pos() across schema versions


Re: robustness of Schema.Field.pos() across schema versions
John,

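I think this will work fine.  The schema in SCHEMA$ is in sync with
the generated code, so the positions returned by Field.pos() match the
indexes that the generated get() and put() methods expect.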

You should be able to avoid this copying by generating code with a
different base class that contains your methods.  In particular, it
should be easy to modify the record.vm template so that generated
classes extend your own subclass of SpecificRecordBase.  Templates are
found on the classpath.
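
As a rough illustration of the base-class idea, here is a minimal
sketch in plain Java with no Avro dependency.  Every name in it is
hypothetical: PositionalRecord stands in for Avro's get/put-by-position
contract, ProjectRecordBase plays the role of a custom subclass of
SpecificRecordBase, and User imitates what a generated class might look
like if the template extended that base.  It only shows where the shared
method would live; it does not show how record.vm is actually edited.

```java
import java.util.Arrays;
import java.util.List;

public class BaseClassSketch {

    /** Hypothetical stand-in for Avro's get/put-by-position record contract. */
    interface PositionalRecord {
        List<String> fieldNames();        // stand-in for getSchema().getFields()
        Object get(int pos);
        void put(int pos, Object value);
    }

    /** Hypothetical project base class holding the shared method. */
    static abstract class ProjectRecordBase implements PositionalRecord {
        /** Copies all fields by position; only valid when both records share one schema. */
        public void copyFrom(PositionalRecord other) {
            if (!fieldNames().equals(other.fieldNames())) {
                throw new IllegalArgumentException("records have different fields");
            }
            for (int pos = 0; pos < fieldNames().size(); pos++) {
                put(pos, other.get(pos));
            }
        }
    }

    /** Imitation of a generated class whose template extends the project base. */
    static class User extends ProjectRecordBase {
        private static final List<String> FIELDS = Arrays.asList("name", "age");
        private String name;
        private int age;

        @Override public List<String> fieldNames() { return FIELDS; }

        @Override public Object get(int pos) {
            switch (pos) {
                case 0: return name;
                case 1: return age;
                default: throw new IndexOutOfBoundsException("Bad index: " + pos);
            }
        }

        @Override public void put(int pos, Object value) {
            switch (pos) {
                case 0: name = (String) value; break;
                case 1: age = (Integer) value; break;
                default: throw new IndexOutOfBoundsException("Bad index: " + pos);
            }
        }
    }

    public static void main(String[] args) {
        User decoded = new User();        // pretend this came from a decoder
        decoded.put(0, "Ada");
        decoded.put(1, 36);

        User target = new User();
        target.copyFrom(decoded);         // inherited from the base, not hand-written per class
        System.out.println(target.get(0) + " " + target.get(1)); // prints "Ada 36"
    }
}
```

Because copyFrom is inherited, nothing in it has to be kept in sync by
hand when the record's fields change; only the generated get/put
methods change, and those are regenerated along with the schema.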

We might also add a feature where one can specify an alternate base
class through an API.  This might then be used by the Maven and Ant
tasks.  If that approach sounds useful, please file an issue in Jira.

Doug

On Wed, Jun 6, 2012 at 1:08 PM, John Bates <[EMAIL PROTECTED]> wrote:
> Hi, all.
>
> I'm trying to subclass an Avro IDL-generated class so that it may
> implement an interface used by our project to deserialize data (and
> not necessarily Avro data) from an InputStream.  Ideally, I'd like to
> do something like this:
>
> public class MySubclass extends MyAvroGeneratedClass implements
> MySerializationInterface {
>  @Override
>  public void readObject(InputStream in) throws IOException,
>      ClassNotFoundException {
>
>    // AvroUtil.readObject exists and returns a SpecificRecord given
>    // an InputStream and Schema
>    MyAvroGeneratedClass other = (MyAvroGeneratedClass)
>        AvroUtil.readObject(in, MyAvroGeneratedClass.SCHEMA$);
>
>    // Is this correct?  Is it robust?
>    List<Schema.Field> fields = other.getSchema().getFields();
>    for (Schema.Field field : fields) {
>      int position = field.pos();
>      this.put(position, other.get(position));
>    }
>  }
> }
>
> I'm trying to avoid the setters and getters supplied by the generated
> class, as using them would require this subclass to stay in sync with
> the IDL-generated class, which would probably become a point of
> failure.
>
> Is this approach robust to changes in the schema?  That is, if the
> schema changes at some point in the future, will it be possible to
> deserialize data that has been serialized with an older version of the
> schema?  Is there a better (read: more correct, more robust, more
> consistent with Avro's design) way to do this?
>
> I sincerely appreciate your help - I've been blocked for a few days on this.
>
> Thanks in advance,
> John Bates