Hadoop >> mail # user >> Outputformat and RecordWriter in Hadoop Pipes


Re: Outputformat and RecordWriter in Hadoop Pipes
Hi,

On Tue, Sep 13, 2011 at 12:27 PM, Vivek K <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I am trying to build a Hadoop/MR application in c++ using hadoop-pipes. I
> have been able to successfully work with my own mappers and reducers, but
> now I need to generate output (from reducer) in a format different from the
> default TextOutputFormat. I have a few questions:
>
> (1) Similar to Hadoop streaming, is there an option to set OutputFormat in
> HadoopPipes (in order to use say org.apache.hadoop.io.SequenceFile.Writer) ?
> I am using Hadoop version 0.20.2.
>
> (2) For a simple test on how to use an in-built non-default writer, I tried
> the following:
>
>     hadoop pipes -D hadoop.pipes.java.recordreader=true -D
> hadoop.pipes.java.recordwriter=false -input input.seq -output output
> -inputformat org.apache.hadoop.mapred.SequenceFileInputFormat -writer
> org.apache.hadoop.io.SequenceFile.Writer -program my_test_program
The -writer option wants an OutputFormat class, not a writer class:

      if (results.hasOption("writer")) {
        setIsJavaRecordWriter(job, true);
        job.setOutputFormat(getClass(results, "writer", job,
                                      OutputFormat.class));
      }

As such I think you want:

-writer org.apache.hadoop.mapred.SequenceFileOutputFormat

SequenceFile.Writer simply writes sequence files; it has nothing to do with
MapReduce.

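This is also wrong:

hadoop.pipes.java.recordwriter=false

To write through a Java OutputFormat the Java record writer has to be
enabled, so this should be true. Note that the -writer branch in the code
above already calls setIsJavaRecordWriter(job, true).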
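
Putting both corrections together, the invocation would look roughly like
the sketch below. I have not run this against 0.20.2. The input path, output
path and program name are the placeholders from your original command, and
the variable names are only for illustration. Setting
hadoop.pipes.java.recordwriter=true explicitly is probably redundant given
what -writer does, but it should be harmless. The script only assembles and
prints the command so you can check it before submitting anything.

```shell
#!/bin/sh
# Sketch only: assembles the corrected "hadoop pipes" command and prints it.
# input.seq, output and my_test_program are placeholders from the original post.

INPUT_FORMAT=org.apache.hadoop.mapred.SequenceFileInputFormat
# -writer takes an OutputFormat class, not SequenceFile.Writer.
OUTPUT_FORMAT=org.apache.hadoop.mapred.SequenceFileOutputFormat

# Each continuation line keeps one space before the backslash, so the
# options end up separated by single spaces in the final string.
CMD="hadoop pipes \
-D hadoop.pipes.java.recordreader=true \
-D hadoop.pipes.java.recordwriter=true \
-input input.seq -output output \
-inputformat $INPUT_FORMAT \
-writer $OUTPUT_FORMAT \
-program my_test_program"

# Print the command for review; nothing is submitted to the cluster here.
echo "$CMD"
```

Once the printed command looks right, paste it into the shell or run it
with eval "$CMD".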

Brock