MapReduce, mail # dev - sort phase in hadoop mapper

Samaneh Shokuhi 2013-04-18, 08:53
Re: sort phase in hadoop mapper
Sandy Ryza 2013-04-18, 18:46
Hi Samaneh,

If you want to see the map outputs post sort/shuffle, the easiest way is
probably to use an IdentityReducer and inspect the job's output.

Can you be more specific on what you need to disable the sort phase for?
Sorting is used in part to group map outputs and route them to the correct
reducer.

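To illustrate what the sorted, grouped map output looks like once an identity reducer passes it through, here is a minimal plain-Java sketch. It is only an illustration under stated assumptions, not Hadoop code: the class SortGroupSketch and its methods are hypothetical names, a TreeMap stands in for the framework's sort and group step, and Java's natural String ordering stands in for Hadoop's key comparator.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone simulation; none of these names come from Hadoop.
public class SortGroupSketch {

    // Stand-in for a WordCount mapper: emit (word, 1) for every token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
            }
        }
        return out;
    }

    // Stand-in for sort + group: the TreeMap keeps keys in sorted order
    // and collects all values that share a key into one list.
    static TreeMap<String, List<Integer>> sortAndGroup(
            List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Identity reduce: write every (key, value) pair back out unchanged,
    // so the output shows exactly the order the reducer received.
    static List<String> identityReduce(TreeMap<String, List<Integer>> grouped) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            for (Integer value : entry.getValue()) {
                lines.add(entry.getKey() + "\t" + value);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
                map("the quick fox jumps over the dog");
        for (String line : identityReduce(sortAndGroup(pairs))) {
            System.out.println(line);
        }
    }
}
```

Running main prints one tab-separated pair per line with the keys in sorted order, and the two entries for "the" end up adjacent. That adjacency is the grouping a reducer relies on.
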
On Thu, Apr 18, 2013 at 1:53 AM, Samaneh Shokuhi

> Hello all,
> I am running some experiments with the WordCount example on a Hadoop
> cluster and have some questions:
> 1) How can I monitor the output from the mapper before it is flushed to the
> reducer? (In fact, I want to see how the keys are sorted.)
> 2) In one of my experiments I need to disable the sort phase in the mapper and
> send unsorted data to the reducer. Is there any way to disable this sort in
> the mapper, or do I need to modify Hadoop to disable it?
> As I understand it, this functionality is implemented in MapTask.java.
> And of course I don't want to set the number of reducers to zero, because I
> need to have at least one reducer.
> So, any idea how to disable the sort phase in the mapper and monitor the
> output?
> Best,
> Samaneh
Samaneh Shokuhi 2013-04-18, 22:42
Sandy Ryza 2013-04-19, 01:29