Re: Sequence.Sorter Performance
The SequenceFile sorter is OK. It used to be the sort used in the shuffle.
*grin*

Make sure to set io.sort.factor and io.sort.mb to values appropriate for
your hardware. I usually set io.sort.factor to 25 * the number of drives,
and io.sort.mb to the amount of memory (in MB) you can allocate to sorting.
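
As a rough illustration of that rule of thumb, here is a minimal Java sketch
that derives the two values for a node. The class name SortTuning, the method
sortSettings, and the example node (4 drives, 200 MB for sorting) are all
hypothetical; only the property names and the 25-per-drive factor come from
the message above. It deliberately avoids any Hadoop dependency and just
builds the key/value pairs you would then put in your job configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SortTuning {
    // Rule of thumb from the message above: 25 merge streams per drive.
    static final int STREAMS_PER_DRIVE = 25;

    // Hypothetical helper: returns the two sort properties as strings,
    // ready to be copied into a job configuration.
    static Map<String, String> sortSettings(int drives, int sortMemoryMb) {
        if (drives < 1 || sortMemoryMb < 1) {
            throw new IllegalArgumentException("drives and sortMemoryMb must be positive");
        }
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("io.sort.factor", Integer.toString(STREAMS_PER_DRIVE * drives));
        settings.put("io.sort.mb", Integer.toString(sortMemoryMb));
        return settings;
    }

    public static void main(String[] args) {
        // Assumed example node: 4 drives, 200 MB set aside for sorting.
        sortSettings(4, 200).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The memory figure is an input rather than something computed, since how much
you can spare for sorting depends on what else is running in the same JVM.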

-- Owen