Hadoop >> mail # user >> what does "keep 10% map, 40% reduce" mean in gridmix2's README?


what does "keep 10% map, 40% reduce" mean in gridmix2's README?
Hi, all

I'm using gridmix2 to test my cluster, and its README file contains
statements like the following:

+1) Three stage map/reduce job
+   Input:      500GB compressed (2TB uncompressed) SequenceFile
+               (k,v) = (5 words, 100 words)
+               hadoop-env: FIXCOMPSEQ
+   Compute1:   keep 10% map, 40% reduce
+   Compute2:   keep 100% map, 77% reduce
+               Input from Compute1
+   Compute3:   keep 116% map, 91% reduce
+               Input from Compute2
+   Motivation: Many user workloads are implemented as pipelined map/reduce
+               jobs, including Pig workloads
Can anyone tell me what "keep 10% map, 40% reduce" means here?

Best,

--
Nan Zhu
School of Electronic, Information and Electrical Engineering,229
Shanghai Jiao Tong University
800,Dongchuan Road,Shanghai,China
E-Mail: [EMAIL PROTECTED]