Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hadoop >> mail # user >> what does "keep 10% map, 40% reduce" mean in gridmix2's README?


Copy link to this message
-
what does "keep 10% map, 40% reduce" mean in gridmix2's README?
Hi, all

I'm using gridmix2 to test my cluster, while in its README file, there are
statements like the following:

+1) Three stage map/reduce job
+   Input:      500GB compressed (2TB uncompressed) SequenceFile
+                 (k,v) = (5 words, 100 words)
+                 hadoop-env: FIXCOMPSEQ
+     *Compute1:   keep 10% map, 40% reduce
+   Compute2:   keep 100% map, 77% reduce
+                 Input from Compute1
+     Compute3:   keep 116% map, 91% reduce
+                 Input from Compute2
+     *Motivation: Many user workloads are implemented as pipelined map/reduce
+                 jobs, including Pig workloads
Can anyone tell me what does "keep 10% map, 40% reduce" mean here?

Best,

--
Nan Zhu
School of Electronic, Information and Electrical Engineering,229
Shanghai Jiao Tong University
800,Dongchuan Road,Shanghai,China
E-Mail: [EMAIL PROTECTED]