Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hadoop >> mail # user >> Sequence.Sorter Performance


Copy link to this message
-
Re: Sequence.Sorter Performance
The SequenceFile sorter is ok. It used to be the sort used in the shuffle.
*grin*

Make sure to set io.sort.factor and io.sort.mb to appropriate values for
your hardware. I'd usually use io.sort.factor as 25 * drives and io.sort.mb
is the amount of memory you can allocate to the sorting.

-- Owen
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB