Re: [VOTE] Direction for Hadoop development
On 12/07/2010 10:25 AM, Owen O'Malley wrote:
> The new code reads the new or old versions of SequenceFile seamlessly
> using auto-detection of the version. The old code fails with an explicit
> message saying that it can't read this version. This is the only
> mechanism available when upgrading a file format with a single version
> number and is the mechanism that we've used 6 times in the past.

The last such change was nearly four years ago, in:

https://issues.apache.org/jira/browse/HADOOP-732

The quantity of data stored in SequenceFiles has greatly increased over
the past four years.  The project's concern for compatibility has
increased correspondingly over that time.
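
For reference, the single-version-number mechanism quoted above can be sketched roughly as follows. This is only an illustration, not code from the patch or from SequenceFile itself: the class name, the exception type, and the version numbers 6 and 7 are assumptions made up for the example. A reader accepts any version up to the newest one it knows, and rejects anything newer with an explicit message.

```java
import java.util.Arrays;

// Illustrative sketch only; not Hadoop's SequenceFile implementation.
final class VersionedHeader {
    // Three magic bytes followed by a single version byte.
    static final byte[] MAGIC = {'S', 'E', 'Q'};

    /** Hypothetical exception: the file is newer than this reader understands. */
    static final class UnsupportedVersionException extends RuntimeException {
        UnsupportedVersionException(String message) {
            super(message);
        }
    }

    /** Builds a 4-byte header (magic + version) for the given version. */
    static byte[] header(int version) {
        byte[] h = Arrays.copyOf(MAGIC, MAGIC.length + 1);
        h[MAGIC.length] = (byte) version;
        return h;
    }

    /**
     * Auto-detects the version from the start of a file.
     * Older versions are accepted; newer ones fail with an explicit message.
     */
    static int detectVersion(byte[] fileStart, int newestSupported) {
        if (fileStart.length < MAGIC.length + 1
                || !Arrays.equals(Arrays.copyOf(fileStart, MAGIC.length), MAGIC)) {
            throw new IllegalArgumentException("not a versioned sequence file");
        }
        int version = fileStart[MAGIC.length] & 0xff;
        if (version > newestSupported) {
            throw new UnsupportedVersionException(
                "file is version " + version
                + " but this reader supports versions up to " + newestSupported);
        }
        return version;
    }
}
```

The cost is visible in the last branch: once files are written with the newer version number, every reader that predates it fails on them, whether or not the file uses any new feature.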

The new format version need not be written when folks are using
Writable or some other serialization currently supported by
SequenceFile.  The only situation in your patch where the new version is
required is for Avro.  You might simply drop support for Avro and leave
the file version number alone since Avro already includes a container
file format.  Or you might only use the new format version for
non-class-determined serializations like Avro.  Or you might use
SequenceFile's existing metadata for non-class-determined serializations
like Avro and leave the file version number alone.
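
A minimal sketch of that last option, under stated assumptions: the plain Map below stands in for the file's existing metadata, and the two key names are hypothetical, invented for this example rather than taken from SequenceFile or the patch. Files without the key keep working exactly as before, so the version number never has to change.

```java
import java.util.Map;

// Illustrative sketch only; the metadata keys below are hypothetical.
final class SerializationChooser {
    static final String SERIALIZATION_KEY = "example.serialization"; // assumed name
    static final String SCHEMA_KEY = "example.schema";               // assumed name

    /**
     * Decides how records should be deserialized.
     * With no serialization entry in the metadata, fall back to the
     * class name stored in the header, as class-determined
     * serializations such as Writable do today.
     */
    static String choose(String headerClassName, Map<String, String> metadata) {
        String serialization = metadata.get(SERIALIZATION_KEY);
        if (serialization == null) {
            return "class:" + headerClassName;
        }
        // Non-class-determined serializations carry their schema in metadata.
        String schema = metadata.get(SCHEMA_KEY);
        return serialization + ":" + (schema == null ? "" : schema);
    }
}
```

An old reader would simply ignore the extra metadata entries, so it would still open the file, though it would not know how to interpret records written with a non-class-determined serialization.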

Doug