Hadoop, mail # user - Is the sort(in sort and shuffle) always required


Is the sort(in sort and shuffle) always required
Saptarshi Guha 2010-06-19, 16:16
Hello,
My question: is the sort (in the sort and shuffle) absolutely required?
If I wanted MapReduce to partition (using the map) and then aggregate (using
reduce) without needing the keys to be sorted,
is it possible to turn off the sorting? Or is the sorted order in which keys
reach the reducer just a side effect of a sort that is itself
vital to the efficient operation of MapReduce?
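
To make concrete what "partition, then aggregate, without sorted keys" could look like, here is a minimal sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; the function names (partition, shuffle, reduce_unsorted), the CRC32-based partitioner, and the sample data are all assumptions chosen for illustration.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    """Pick a reducer by hashing the key (CRC32 is stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers):
    """Route each (key, value) pair to its reducer's bucket, in arrival order."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets


def reduce_unsorted(bucket):
    """Sum values per key with a hash table; the keys are never sorted."""
    totals = defaultdict(int)
    for key, value in bucket:
        totals[key] += value
    return dict(totals)


# Hypothetical map output: (key, value) pairs in no particular order.
pairs = [("b", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)]

results = {}
for bucket in shuffle(pairs, num_reducers=2):
    results.update(reduce_unsorted(bucket))
```

Here every pair for a given key still lands on the same reducer, and the per-key sums come out the same whatever order the pairs arrive in; only the grouping is done by hashing instead of by sorting.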
Thanks
Saptarshi