MapReduce, mail # user - How / when does On-disk merge work?


- 2013-10-25, 19:35
Re: How / when does On-disk merge work?
Ravi Prakash 2013-10-28, 14:15
Hi!

Tom White's "Hadoop: The Definitive Guide" is probably the best source of information on this, apart from the code itself ;-) If you are so inclined, take a look at MergeManagerImpl.java.

HTH
Ravi  

On Friday, October 25, 2013 2:36 PM, - <[EMAIL PROTECTED]> wrote:
 
Hi All,

Can anyone point me to documentation that explains in detail how the on-disk merge in the reduce phase works in Hadoop 2.2.0?
There is an explanation on this page, but I am afraid it could be outdated, since what I observe in my log files is a bunch of "OnDiskMerger - Thread to merge on-disk map-outputs" activity at the end of the merge phase.

Thanks,
-