MapReduce, mail # dev - One output file per node


Cardon, Tejay E 2012-12-12, 18:02
Aloke Ghoshal 2012-12-13, 07:47
Robert Evans 2012-12-13, 14:56
Re: One output file per node
Radim Kolar 2012-12-14, 01:03
If you have strong data locality demands, then try Peregrine:
http://peregrine_mapreduce.bitbucket.org/
It's 2x faster than Hadoop for multipass job types. It also has very
fast node recovery. I plan to do this for HDFS; the concept is similar
to "virtual nodes".

It's not Hadoop or HDFS compatible and it has no ecosystem. I am not sure
if it is still under development; there have been no commits in the last
few months.

https://bitbucket.org/burtonator/peregrine/commits
Radim Kolar 2012-12-13, 03:00