MapReduce, mail # user - How to set SequenceFile.Metadata from within SequenceFileOutputFormat?


Re: How to set SequenceFile.Metadata from within SequenceFileOutputFormat?
David Rosenstrauch 2010-08-10, 02:52
On 08/09/2010 09:14 PM, Harsh J wrote:
> Another solution would be to create a custom named output using
> mapred.lib.MultipleOutputs and collecting to that instead of the
> job-set output format (which one can set to NullOutputFormat so it
> doesn't complain about existing paths, etc.).
>
> So if you'd want 'foo' prefix to your 00000-NNNNN numbered output
> files (instead of default 'part'), you'd create it with
> MultipleOutputs.addNamedOutput(Conf, "foo", YourOutFormat.class,
> Key.class, Value.class);
>
> The extension, I believe, can be changed too, while 'getting' the path
> from the FileOutputFormat while building your RecordWriter. Something
> like:
> Path outPath = FileOutputFormat.getTaskOutputPath(job, name+YOUR_EXTENSION);
> // Now create the 'writer' on this path.

Thanks for the tip. I didn't know about MultipleOutputs, though it's
probably overkill for what I'm doing.

Thanks again,

DR