Re: Do I have to sort?
Thank you for the great instructions!

Mark

On Mon, Jun 18, 2012 at 9:53 AM, John Armstrong <[EMAIL PROTECTED]> wrote:

> On 06/18/2012 10:40 AM, Mark Kerzner wrote:
>
>> That sounds very interesting, and I may implement such a workflow. But
>> can I write back to HDFS in the mapper? In the reducer it is a standard
>> context.write(), but the mapper has a different context.
>>
>
> Both Mapper.Context and Reducer.Context descend from
> TaskInputOutputContext, which is where the write() method is defined, so
> they're both outputting their data in the same way.
>
> If you don't have a Reducer -- only Mappers and fully parallel data
> processing -- then when you configure your job you set the number of
> reducers to zero.  Then the mapper context knows that mapper output is the
> last step, so it uses the specified OutputFormat to write out the data,
> just like your reducer context currently does with reducer output.
>
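
To make John's point concrete, here is a small plain-Java model of the idea. It is only a sketch and does not use Hadoop: TaskContext stands in for TaskInputOutputContext, and OutputContext, ShuffleContext, and run() are hypothetical names invented for this illustration. The map() function is the same in both cases; only the context it is handed changes with the number of reducers. In a real driver the corresponding setting is Job.setNumReduceTasks(0).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified stand-in for the Hadoop context hierarchy; NOT the real classes.
class MapOnlySketch {

    // Plays the role of TaskInputOutputContext: write() is declared once here.
    static abstract class TaskContext {
        abstract void write(String key, int value);
    }

    // Context for the last stage of a job: records go straight to the job output.
    static class OutputContext extends TaskContext {
        final List<String> output;
        OutputContext(List<String> output) { this.output = output; }
        @Override
        void write(String key, int value) { output.add(key + "\t" + value); }
    }

    // Context for mappers that feed reducers: records are grouped by sorted key.
    static class ShuffleContext extends TaskContext {
        final TreeMap<String, List<Integer>> groups = new TreeMap<>();
        @Override
        void write(String key, int value) {
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
    }

    // The mapper only ever calls write(); it does not know where records end up.
    static void map(String line, TaskContext context) {
        for (String word : line.split(" ")) {
            context.write(word, 1);
        }
    }

    static void reduce(String key, List<Integer> values, TaskContext context) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        context.write(key, sum);
    }

    // numReduceTasks == 0 models a map-only job: mapper output is the final output.
    static List<String> run(List<String> lines, int numReduceTasks) {
        List<String> output = new ArrayList<>();
        if (numReduceTasks == 0) {
            TaskContext context = new OutputContext(output);
            for (String line : lines) {
                map(line, context);
            }
        } else {
            ShuffleContext shuffle = new ShuffleContext();
            for (String line : lines) {
                map(line, shuffle);
            }
            TaskContext context = new OutputContext(output);
            for (Map.Entry<String, List<Integer>> e : shuffle.groups.entrySet()) {
                reduce(e.getKey(), e.getValue(), context);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("b a", "a");
        System.out.println("map-only:     " + run(lines, 0));
        System.out.println("with reducer: " + run(lines, 1));
    }
}
```

In the map-only run the records come out in the order the mapper wrote them, unsorted and unaggregated, because the grouping step in this model is skipped when there are no reducers.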