Tricks to upgrading Sequence Files?
David Parks 2013-01-29, 07:41
Does anyone have any good tricks for upgrading a sequence file?
We maintain a sequence file like a flat file DB and the primary object in there changed in recent development.
It's trivial to write a job to read in the sequence file, update the object, and write it back out in the new format.
But since sequence files record the key/value class names in their headers, I would either need to rename the model object with a version number or change the header of each sequence file.
Just wondering if there are any nice tricks to this.
RE: Tricks to upgrading Sequence Files?
David Parks 2013-01-30, 02:17
I'll consider a patch to SequenceFile. If we could manually override the key and value classes that are read from the sequence file headers, we'd have a clean solution.
I don't like versioning my Model object because it's used by tens of other classes, and I don't want to risk less-maintained classes continuing to use an old version.
For the time being I just used two jobs. First I renamed the old Model object to the original name, read it in, upgraded it, and wrote out the new version under a different class name.
Then I renamed the classes again so that the new Model object used the original name, read in the records written under the altered name, and cloned them into the original name.
All in all only an hour's work, but a cleaner process would be better. I'll add the request to JIRA at a minimum.
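A rough sketch of the override idea described above, assuming a reader that resolves the class name stored in a file header through a lookup step. Everything here is hypothetical: `ClassNameOverride`, `LegacyModel`, and the name `com.example.Model` are illustrative only, and SequenceFile.Reader does not offer such a hook as far as this thread establishes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical resolver: maps a class name stored in a file header to a class. */
class ClassNameOverride {
    /** Stand-in for the old layout of the model object (illustrative only). */
    static class LegacyModel { }

    private final Map<String, Class<?>> overrides = new HashMap<>();

    /** Register a replacement for a class name recorded in an old header. */
    void override(String storedName, Class<?> replacement) {
        overrides.put(storedName, replacement);
    }

    /** Return the override if one is registered, otherwise load the named class. */
    Class<?> resolve(String storedName) {
        Class<?> replacement = overrides.get(storedName);
        if (replacement != null) {
            return replacement;
        }
        try {
            return Class.forName(storedName);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("No class or override for " + storedName, e);
        }
    }

    public static void main(String[] args) {
        ClassNameOverride resolver = new ClassNameOverride();
        // "com.example.Model" stands in for the name written in the old header.
        resolver.override("com.example.Model", LegacyModel.class);
        System.out.println(resolver.resolve("com.example.Model").getName());
        System.out.println(resolver.resolve("java.lang.String").getName());
    }
}
```

With a hook like this, a single upgrade job could read old records as the legacy class and write them back under the current class name, without the second renaming pass.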
Dave

-----Original Message-----
From: Harsh J [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 30, 2013 2:32 AM
To: <[EMAIL PROTECTED]>
Subject: Re: Tricks to upgrading Sequence Files?
This is a pretty interesting question, but unfortunately there isn't an inbuilt way in SequenceFile itself to handle this. However, your key/value classes could perhaps be made to handle versioning: detecting whether what they've read is in an older encoding and decoding it appropriately (while handling the newer encoding separately, in the normal fashion).
This would be much better than going down the classloader hack paths, I think?
-- Harsh J
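A minimal sketch of the versioning approach suggested above, using only java.io. The class `VersionedModel` and its fields are hypothetical; in a real job it would implement org.apache.hadoop.io.Writable, whose `write(DataOutput)` and `readFields(DataInput)` signatures are mirrored here. The sketch assumes the legacy layout began with a non-negative int id, so a negative first int can safely mark the newer layout. If the old layout has no such distinguishable first field, this trick does not apply.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical model object that can read both its old and new layouts. */
class VersionedModel {
    // Negative marker: assumes legacy records start with a non-negative id.
    static final int VERSION_2 = -2;

    int id;
    String name;
    long updatedAt; // hypothetical field added in the new layout

    /** Always writes the newest layout: marker first, then the fields. */
    public void write(DataOutput out) throws IOException {
        out.writeInt(VERSION_2);
        out.writeInt(id);
        out.writeUTF(name);
        out.writeLong(updatedAt);
    }

    /** Reads either layout, deciding by the sign of the first int. */
    public void readFields(DataInput in) throws IOException {
        int first = in.readInt();
        if (first >= 0) {
            // Legacy layout: [id][name], no marker.
            id = first;
            name = in.readUTF();
            updatedAt = 0L; // default for records written before the field existed
        } else if (first == VERSION_2) {
            id = in.readInt();
            name = in.readUTF();
            updatedAt = in.readLong();
        } else {
            throw new IOException("Unknown layout marker: " + first);
        }
    }

    /** Encodes a record in the legacy layout, for demonstration only. */
    static byte[] legacyBytes(int id, String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(id);
            out.writeUTF(name);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Serializes a model in the newest layout. */
    static byte[] toBytes(VersionedModel m) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            m.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deserializes a model from either layout. */
    static VersionedModel fromBytes(byte[] bytes) {
        try {
            VersionedModel m = new VersionedModel();
            m.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return m;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        VersionedModel upgraded = fromBytes(legacyBytes(7, "widget"));
        VersionedModel again = fromBytes(toBytes(upgraded));
        System.out.println(again.id + " " + again.name + " " + again.updatedAt);
    }
}
```

Because the class name stays the same, the header of each existing file remains valid, and a simple read-and-rewrite job would leave every record in the new layout.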
Re: Tricks to upgrading Sequence Files?
Terry Healy 2013-01-30, 15:10
Avro's versioning capability might help, if Avro could replace SequenceFile in your workflow.
Just a thought.
-Terry