MapReduce, mail # dev - One output file per node

Re: One output file per node
Radim Kolar 2012-12-14, 01:03
If you have strong data locality demands, then try
http://peregrine_mapreduce.bitbucket.org/. It's 2x faster than Hadoop for
multipass job types. It also has very fast node recovery. I plan to do
something like this for HDFS; the concept is similar to "virtual nodes".

It's not Hadoop or HDFS compatible, and it has no ecosystem. I am not sure
if it is still under development; there have been no commits in recent months.