HDFS >> mail # user >> MultipleOutputs outputs just one line


MultipleOutputs outputs just one line
Hi,

I'm trying to use MultipleOutputs (Hadoop 1.0.4) to produce multiple
files based on some policy. In the job I set:

MultipleOutputs.addNamedOutput( job, "rejected", TextOutputFormat.class,
Text.class, NullWritable.class );

And in the mapper:

private MultipleOutputs<Text, Writable> mos;

setup(): mos = new MultipleOutputs( context );

map():   if( somecond )
         {
             context.write( new Text( key ), NullWritable.get() );
         }
         else
         {
             logger.info( "Going to write to mos: " + key );
             mos.write( new Text( key ), NullWritable.get(), "/tmp" );
         }

The problem I'm facing is that when multiple mappers run this code, I can
see in the logs that mos.write() is being invoked, but only one line is
written to the output file under /tmp. Is there some config I missed?

Thanks.