Avro dev mailing list: Check for schema backwards compatibility


+
Jeff Kolesky 2012-11-16, 23:37
+
Doug Cutting 2012-11-16, 23:54
+
Jeff Kolesky 2012-11-18, 04:04
+
Doug Cutting 2012-11-19, 18:54
Copy link to this message
-
Re: Check for schema backwards compatibility
Schema.java is an unfortunately large file.  Would it be a reasonable
refactor (done as a separate unit of work, of course) to pull the nested
Schema subclasses (NamedSchema, RecordSchema, ArraySchema, etc.) out into
their own files as package-protected classes?  It would make them more
accessible than they are now as private classes, but it would bring
Schema.java down to a more manageable size.

Jeff
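
As a rough sketch of the split described above (placeholder package and
class names, not Avro's actual source layout): the subtype moves into its
own file but stays package-private, so user code outside the package still
cannot see it, while construction keeps going through the parent class.

// File: org/example/schema/Schema.java
package org.example.schema;

public abstract class Schema {
  public abstract String getTypeName();

  // Construction still goes through Schema, so the public API is unchanged.
  public static Schema createRecord() {
    return new RecordSchema();
  }
}

// File: org/example/schema/RecordSchema.java
package org.example.schema;

// No access modifier: package-private, visible to Schema and other classes
// in the same package, but not to user code outside it.
class RecordSchema extends Schema {
  @Override
  public String getTypeName() {
    return "record";
  }
}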

On Mon, Nov 19, 2012 at 10:54 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:

> I don't feel strongly: a method on Schema would be fine with me as
> would an auxiliary tool class.  Schema.java is a huge file already,
> but I'm not sure that really causes any problems.
>
> On Sat, Nov 17, 2012 at 8:04 PM, Jeff Kolesky <[EMAIL PROTECTED]> wrote:
> > Would it be appropriate to add this method to the Schema class itself in
> > the same way `subsume` and `unify` were, or would you rather see a
> separate
> > tool, similar to SchemaNormalization?
> >
> > On Fri, Nov 16, 2012 at 3:54 PM, Doug Cutting <[EMAIL PROTECTED]>
> wrote:
> >
> >> On Fri, Nov 16, 2012 at 3:37 PM, Jeff Kolesky <[EMAIL PROTECTED]>
> >> wrote:
> >> > Has there been discussion of the need for this type of tool?  Would
> other
> >> > people find it useful?
> >>
> >> I have not seen this discussed, but I can see the utility.  One could
> >> automatically check new schemas for compatibility with prior versions
> >> before using them, to ensure that both old and new data can be read
> >> with the new schema.  This would require checking that any added
> >> fields have default values specified.
> >>
> >> Related is the ability to tell whether an old schema can be used to read
> >> data written with a newer one.  This would require that any removed fields
> >> have a default value specified.
> >>
> >> In general, to ensure readability in both cases, one should always
> >> provide a default value for every field.  So a method that traversed a
> >> schema and verified that each field has a default value might suffice.
> >>
> >> Doug
> >>
>
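
A minimal sketch of the default-value check Doug describes, written against
the Avro 1.7-era Java API (where Schema.Field.defaultValue() returns null if
no default was declared); the class and method names are placeholders, not
an existing Avro tool:

import java.util.HashSet;
import java.util.Set;

import org.apache.avro.Schema;
import org.apache.avro.Schema.Field;

/** Hypothetical helper: reports record fields that lack a default value. */
public class SchemaDefaultsChecker {

  /** Returns true if every record field reachable from root declares a default. */
  public static boolean allFieldsHaveDefaults(Schema root) {
    return check(root, new HashSet<String>());
  }

  private static boolean check(Schema s, Set<String> visitedRecords) {
    switch (s.getType()) {
      case RECORD:
        if (!visitedRecords.add(s.getFullName())) {
          return true;  // already checked; guards against recursive types
        }
        boolean ok = true;
        for (Field f : s.getFields()) {
          if (f.defaultValue() == null) {
            System.err.println("no default: " + s.getFullName() + "." + f.name());
            ok = false;
          }
          ok &= check(f.schema(), visitedRecords);
        }
        return ok;
      case UNION:
        boolean unionOk = true;
        for (Schema branch : s.getTypes()) {
          unionOk &= check(branch, visitedRecords);
        }
        return unionOk;
      case ARRAY:
        return check(s.getElementType(), visitedRecords);
      case MAP:
        return check(s.getValueType(), visitedRecords);
      default:
        return true;  // primitives, enums, fixed: no fields to check
    }
  }
}

This covers only the "every field has a default" rule discussed above;
checking a specific old/new schema pair (type changes, renames, etc.) would
require comparing the two schemas directly rather than a single traversal.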
Later in this thread:
Doug Cutting  2012-11-19, 19:18