Avro >> mail # user >> Effort towards Avro 2.0?

Re: Effort towards Avro 2.0?
It sounds like you're proposing to break language API compatibility.  Are
you also proposing to break wire compatibility for Avro HTTP RPC, Avro
Netty RPC, and/or Avro datafiles?

I'd appreciate a mechanism that lets systems already using Avro avoid
being forced to choose one version or the other.  (One approach to this
is to use a different package name.)

As for adding to your list, I'd like to see a code-generated API for
Python.  (We like to call these APIs "specific" but I find that terminology

-- Philip
On Mon, Dec 2, 2013 at 1:42 PM, Christophe Taton <[EMAIL PROTECTED]> wrote:

> Hi all,
> Avro, in its current form, exhibits a number of limitations that are hard
> to work with or around, and hard to fix within the scope of Avro 1.x:
> fixing these issues would introduce incompatible changes that warrant a
> major version bump, i.e. Avro 2.0. An Avro 2.0 branch would be an
> opportunity to address most of the issues that appear to have been held
> back for compatibility purposes so far.
> I would like to initiate an effort in this direction and I am willing to
> do the necessary work to gather and organize requirements, and draft a
> design document for what Avro 2.0 would look like. For this reason, if you
> have opinions regarding an Avro 2.0 branch or regarding issues and features
> that could fit in Avro 2.0, please reply to this thread.
> To bootstrap, below is a list I gathered over the last couple of years
> from several discussions:
>    - Specification
>       - Improved support for unions (incompatible change with named unions
>       and union properties).
>       - New extension data type, similar to ProtocolBuffer extensions
>       (incompatible change).
>       - Clear separation between Avro schema (data format) and specific
>       API client concerns: for example, the way Avro strings are exposed through
>       the Java API should not pollute the schema definition. Each particular Java
>       client should configure their own decoders with the way they want Avro
>       strings to be represented.
>       - Clarification of compatibility and type promotion (safe lossless
>       conversions vs. best-effort lossy conversions): promoting int to float
>       potentially loses precision, which is not necessarily acceptable for all
>       clients. Avro decoders should let clients configure which mode they need.
>    - IDL
>       - Generalized IDL for Avro schemas.
>       - Support for recursive records.
>       - Meta-schema: IDL definition for a schema.
>    - Java API
>       - Truly immutable schema objects (no properties / hashcode mutation
>       after construction).
>       - Immutable records.
>       - Complete record builder API (current record builders do not play
>       well with nested records).
>       - Complete generic API (there currently is no GenericUnion or
>       GenericMap).
>       - Improved unions support: union values as java.lang.Object are
>       less than ideal; union values could expose the union branch through an enum
>       (nulls could be handled specifically).
>        - Python 3 support
>    - RPC
>       - SASL support
>       - Full Python/Java parity and interoperability.
> Please comment on or extend this list. Provided there is enough interest,
> I'll happily digest feedback and organize it into a document (most likely
> a wiki page?).
> Thanks,
> Christophe
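
To make the type-promotion item in the quoted list concrete (promoting int to float can lose precision), here is a minimal Python sketch. It is not Avro code and uses no Avro API: the helper names promote_int_to_float and is_lossless_promotion are hypothetical, and the only assumption is that an Avro int is a 32-bit signed integer and an Avro float is a 32-bit IEEE 754 value. The "strict mode" idea in the second helper is just one possible reading of "let clients configure which mode they need".

```python
import struct


def promote_int_to_float(value: int) -> float:
    """Simulate promoting a 32-bit int to a 32-bit IEEE 754 float.

    Python floats are 64-bit, so the value is packed into 4 bytes and
    unpacked again to force single-precision rounding.
    """
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


def is_lossless_promotion(value: int) -> bool:
    """Hypothetical strict-mode check: does the int survive the round trip?"""
    return int(promote_int_to_float(value)) == value


if __name__ == "__main__":
    # A float32 has a 24-bit significand, so integers above 2**24 are
    # not all representable exactly.
    for candidate in (2**24, 2**24 + 1, 2**31 - 1):
        promoted = promote_int_to_float(candidate)
        status = "lossless" if is_lossless_promotion(candidate) else "lossy"
        print(f"{candidate} -> {promoted!r} ({status})")
```

A decoder with a lossless-only mode could reject an int-to-float promotion outright, or apply a per-value check like the one above; which of these the proposal intends is not specified in the thread.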