Leo Leung 2012-08-29, 17:11
MRBench Maps strange behaviour
Gaurav Dasgupta 2012-08-28, 10:32
Hi All,
I executed the "MRBench" program from "hadoop-test.jar" on my 12-node CDH3 cluster. After executing it, I made some strange observations regarding the number of Maps it ran.

First I ran the command:

    hadoop jar /usr/lib/hadoop-0.20/hadoop-test.jar mrbench -numRuns 3 -maps 200 -reduces 200 -inputLines 1024 -inputType random

I could see that the actual number of Maps it ran was 201 (for all 3 runs) instead of 200 (though the end report displays the number launched as 200). Here is the console report:

    12/08/28 04:34:35 INFO mapred.JobClient: Job complete: job_201208230144_0035
    12/08/28 04:34:35 INFO mapred.JobClient: Counters: 28
    12/08/28 04:34:35 INFO mapred.JobClient: Job Counters
    12/08/28 04:34:35 INFO mapred.JobClient: Launched reduce tasks=200
    12/08/28 04:34:35 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=617209
    12/08/28 04:34:35 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
    12/08/28 04:34:35 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
    12/08/28 04:34:35 INFO mapred.JobClient: Rack-local map tasks=137
    *12/08/28 04:34:35 INFO mapred.JobClient: Launched map tasks=201*
    12/08/28 04:34:35 INFO mapred.JobClient: Data-local map tasks=64
    12/08/28 04:34:35 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=1756882

Again, I ran MRBench for just 10 Maps and 10 Reduces:

    hadoop jar /usr/lib/hadoop-0.20/hadoop-test.jar mrbench -maps 10 -reduces 10

This time the actual number of Maps was only 2, and again the end report displays Maps launched as 10. The console output:

    12/08/28 05:05:35 INFO mapred.JobClient: Job complete: job_201208230144_0040
    12/08/28 05:05:35 INFO mapred.JobClient: Counters: 27
    12/08/28 05:05:35 INFO mapred.JobClient: Job Counters
    12/08/28 05:05:35 INFO mapred.JobClient: Launched reduce tasks=20
    12/08/28 05:05:35 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=6648
    12/08/28 05:05:35 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
    12/08/28 05:05:35 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
    *12/08/28 05:05:35 INFO mapred.JobClient: Launched map tasks=2*
    12/08/28 05:05:35 INFO mapred.JobClient: Data-local map tasks=2
    12/08/28 05:05:35 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=163257
    12/08/28 05:05:35 INFO mapred.JobClient: FileSystemCounters
    12/08/28 05:05:35 INFO mapred.JobClient: FILE_BYTES_READ=407
    12/08/28 05:05:35 INFO mapred.JobClient: HDFS_BYTES_READ=258
    12/08/28 05:05:35 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1072596
    12/08/28 05:05:35 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=3
    12/08/28 05:05:35 INFO mapred.JobClient: Map-Reduce Framework
    12/08/28 05:05:35 INFO mapred.JobClient: Map input records=1
    12/08/28 05:05:35 INFO mapred.JobClient: Reduce shuffle bytes=647
    12/08/28 05:05:35 INFO mapred.JobClient: Spilled Records=2
    12/08/28 05:05:35 INFO mapred.JobClient: Map output bytes=5
    12/08/28 05:05:35 INFO mapred.JobClient: CPU time spent (ms)=17070
    12/08/28 05:05:35 INFO mapred.JobClient: Total committed heap usage (bytes)=6218842112
    12/08/28 05:05:35 INFO mapred.JobClient: Map input bytes=2
    12/08/28 05:05:35 INFO mapred.JobClient: Combine input records=0
    12/08/28 05:05:35 INFO mapred.JobClient: SPLIT_RAW_BYTES=254
    12/08/28 05:05:35 INFO mapred.JobClient: Reduce input records=1
    12/08/28 05:05:35 INFO mapred.JobClient: Reduce input groups=1
    12/08/28 05:05:35 INFO mapred.JobClient: Combine output records=0
    12/08/28 05:05:35 INFO mapred.JobClient: Physical memory (bytes) snapshot=3348828160
    12/08/28 05:05:35 INFO mapred.JobClient: Reduce output records=1
    12/08/28 05:05:35 INFO mapred.JobClient: Virtual memory (bytes) snapshot=22955810816
    12/08/28 05:05:35 INFO mapred.JobClient: Map output records=1

    *DataLines    Maps    Reduces    AvgTime (milliseconds)*
    *1            20      20         17451*

Can someone please help me understand this behaviour of Hadoop? My main purpose in running MRBench is to calculate the average time for a given number of Maps, Reduces, InputLines, etc. If the number of Maps is not what I submitted, then how can I judge my benchmark results?

Thanks,
Gaurav Dasgupta
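To put the requested and launched map counts side by side for each run, the counter lines in the console output above can be tallied with a small script. This is only a sketch: `parse_counters` and `compare_maps` are hypothetical helper names, not part of Hadoop or MRBench, and the pattern assumes counter lines have the `mapred.JobClient: <name>=<integer>` shape shown in the logs above (with an optional trailing `*` from the email emphasis).

```python
import re

# Assumed line shape, taken from the console output quoted in the post:
#   12/08/28 04:34:35 INFO mapred.JobClient: Launched map tasks=201
# A trailing "*" (email emphasis marker) is tolerated.
COUNTER_RE = re.compile(
    r"mapred\.JobClient:\s+(?P<name>[^=]+?)\s*=\s*(?P<value>\d+)\s*\*?\s*$"
)


def parse_counters(log_text):
    """Return a dict of counter name -> integer value found in the log text."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            counters[match.group("name").strip()] = int(match.group("value"))
    return counters


def compare_maps(requested_maps, log_text):
    """Return (requested, launched, difference); launched is None if absent."""
    launched = parse_counters(log_text).get("Launched map tasks")
    diff = None if launched is None else launched - requested_maps
    return requested_maps, launched, diff


# Excerpt of the first run's console output from the post.
SAMPLE_LOG = """\
12/08/28 04:34:35 INFO mapred.JobClient: Job complete: job_201208230144_0035
12/08/28 04:34:35 INFO mapred.JobClient: Launched reduce tasks=200
12/08/28 04:34:35 INFO mapred.JobClient: Rack-local map tasks=137
*12/08/28 04:34:35 INFO mapred.JobClient: Launched map tasks=201*
12/08/28 04:34:35 INFO mapred.JobClient: Data-local map tasks=64
"""

if __name__ == "__main__":
    print(compare_maps(200, SAMPLE_LOG))  # (200, 201, 1)
```

The script only reports the difference between the two numbers; it says nothing about why the job tracker launched a different number of map tasks than requested.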
Hemanth Yamijala 2012-08-29, 05:56
Gaurav Dasgupta 2012-08-29, 07:44
Hemanth Yamijala 2012-08-29, 08:31
Bejoy KS 2012-08-29, 07:50