Re: How to set SequenceFile.Metadata from within SequenceFileOutputFormat?
David Rosenstrauch 2010-08-10, 02:52
On 08/09/2010 09:14 PM, Harsh J wrote:
> Another solution would be to create a custom named output using
> mapred.lib.MultipleOutputs and collecting to that instead of the
> job-set output format (which one can set to NullOutputFormat so it
> doesn't complain about existing paths, etc.).
> So if you'd want 'foo' prefix to your 00000-NNNNN numbered output
> files (instead of default 'part'), you'd create it with
> MultipleOutputs.addNamedOutput(Conf, "foo", YourOutFormat.class,
> Key.class, Value.class);
> The extension, I believe, can be changed too, while 'getting' the path
> from the FileOutputFormat while building your RecordWriter. Something like:
> Path outPath = FileOutputFormat.getTaskOutputPath(job, name+YOUR_EXTENSION);
> // Now create the 'writer' on this path.
Thanks for the tip - I didn't know about MultipleOutputs. (Though it's
probably overkill for what I'm doing.)
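
For anyone reading this later, here is a rough sketch of how the prefix and extension from the suggestion above would combine into a file name. It is plain Java with no Hadoop dependency. It assumes the <name>-<m|r>-<NNNNN> pattern that, as far as I know, the old mapred API uses for named outputs. The class name, the helper and the ".seq" extension are made up for illustration and are not part of Hadoop.

```java
import java.util.Locale;

// Hypothetical helper: models the file name a task would end up with when a
// named output "foo" is used and an extension is appended in the RecordWriter.
final class OutputNameSketch {

    // Assumed pattern: <prefix>-<m|r>-<5-digit partition><extension>
    public static String taskFileName(String prefix, boolean isReduce,
                                      int partition, String extension) {
        String taskType = isReduce ? "r" : "m";
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%s-%s-%05d%s",
                             prefix, taskType, partition, extension);
    }

    public static void main(String[] args) {
        // Reducer 0 writing to the named output "foo" with a ".seq" extension.
        System.out.println(taskFileName("foo", true, 0, ".seq"));
        // Mapper 12 in a map-only job.
        System.out.println(taskFileName("foo", false, 12, ".seq"));
    }
}
```

In a real job the name string would be passed in by the framework and the extension appended before calling FileOutputFormat.getTaskOutputPath, as in the quoted snippet.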