HDFS, mail # user - MultipleOutputs outputs just one line


Re: MultipleOutputs outputs just one line
Harsh J 2013-01-24, 07:36
Hi Barak,

As instructed in the Javadoc at
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html,
do you also make sure to call mos.close() at the end of the Mapper (in its
cleanup() method)?
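
To illustrate why the close() call can matter, here is a minimal Java sketch.
It assumes the extra output is buffered until it is closed, which is a guess
about the cause here and not something confirmed in this thread. SketchMapper
and its methods are hypothetical stand-ins that only mirror the
setup()/map()/cleanup() lifecycle with plain java.io classes; they are not
Hadoop's Mapper or MultipleOutputs.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;

public class CloseInCleanupSketch {

    /** Hypothetical stand-in for a mapper that owns one buffered side output. */
    static class SketchMapper {
        // Stands in for the side output file; NOT a Hadoop class.
        private final StringWriter sink = new StringWriter();
        private BufferedWriter side;

        // Mirrors setup(): open the side output once per task.
        void setup() {
            side = new BufferedWriter(sink);
        }

        // Mirrors map(): each record lands in the writer's buffer first.
        void map(String key) throws IOException {
            side.write(key);
            side.newLine();
        }

        // Mirrors cleanup(): closing flushes the buffered records to the sink.
        // In the real mapper this is where mos.close() would be called.
        void cleanup() throws IOException {
            side.close();
        }

        // What has actually reached the sink so far.
        String written() {
            return sink.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        SketchMapper mapper = new SketchMapper();
        mapper.setup();
        mapper.map("a");
        mapper.map("b");
        System.out.println("visible before cleanup: " + !mapper.written().isEmpty());
        mapper.cleanup();
        System.out.println("visible after cleanup: " + !mapper.written().isEmpty());
    }
}
```

In the real mapper, the equivalent step would be overriding cleanup() and
calling mos.close() there.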

On Thu, Jan 24, 2013 at 12:40 PM, Barak Yaish <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm trying to use MultipleOutputs ( hadoop 1.0.4 ) to produce multiple
> files based on some policy. In the job I set:
>
> MultipleOutputs.addNamedOutput( job, "rejected", TextOutputFormat.class,
> Text.class, NullWritable.class );
>
> And at the mapper:
>
> private MultipleOutputs<Text, Writable> mos;
>
> setup(): mos = new MultipleOutputs( context );
>
> map():   if( somecond )
>              {
>                      context.write( new Text( key ), NullWritable.get() );
>              }
>              else
>              {
>                      logger.info( "Going to write to mos: " + key );
>                      mos.write( new Text( key ), NullWritable.get(), "/tmp" );
>              }
>
> The problem I'm facing is that when multiple mappers run that code, I can
> see in the logs that mos.write() is being invoked, but only one line is
> written to the output file under /tmp. Is there some config I missed?
>
> Thanks.

--
Harsh J