Re: Info required regarding JobTracker Job Details/Metrics
Gaurav Dasgupta 2012-08-23, 11:24
Sorry, the correct output for the single wordcount job is:
12/08/23 04:31:22 INFO mapred.JobClient: Job complete: job_201208230144_0002
12/08/23 04:31:22 INFO mapred.JobClient: Counters: 26
12/08/23 04:31:22 INFO mapred.JobClient: Job Counters
12/08/23 04:31:22 INFO mapred.JobClient: Launched reduce tasks=64
12/08/23 04:31:22 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=103718235
12/08/23 04:31:22 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/08/23 04:31:22 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/08/23 04:31:22 INFO mapred.JobClient: Launched map tasks=3060
12/08/23 04:31:22 INFO mapred.JobClient: Data-local map tasks=3060
12/08/23 04:31:22 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=9208855
12/08/23 04:31:22 INFO mapred.JobClient: FileSystemCounters
12/08/23 04:31:22 INFO mapred.JobClient: FILE_BYTES_READ=58263069209
12/08/23 04:31:22 INFO mapred.JobClient: HDFS_BYTES_READ=394195953674
12/08/23 04:31:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=2046757548
12/08/23 04:31:22 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=28095
12/08/23 04:31:22 INFO mapred.JobClient: Map-Reduce Framework
12/08/23 04:31:22 INFO mapred.JobClient: Map input records=586006142
12/08/23 04:31:22 INFO mapred.JobClient: Reduce shuffle bytes=53567298
12/08/23 04:31:22 INFO mapred.JobClient: Spilled Records=108996063
12/08/23 04:31:22 INFO mapred.JobClient: Map output bytes=468042247685
12/08/23 04:31:22 INFO mapred.JobClient: CPU time spent (ms)=91162220
12/08/23 04:31:22 INFO mapred.JobClient: Total committed heap usage (bytes)=981605744640
12/08/23 04:31:22 INFO mapred.JobClient: Combine input records=32046224559
12/08/23 04:31:22 INFO mapred.JobClient: SPLIT_RAW_BYTES=382500
12/08/23 04:31:22 INFO mapred.JobClient: Reduce input records=96063
12/08/23 04:31:22 INFO mapred.JobClient: Reduce input groups=1000
12/08/23 04:31:22 INFO mapred.JobClient: Combine output records=108902950
12/08/23 04:31:22 INFO mapred.JobClient: Physical memory (bytes) snapshot=1147705057280
12/08/23 04:31:22 INFO mapred.JobClient: Reduce output records=1000
12/08/23 04:31:22 INFO mapred.JobClient: Virtual memory (bytes) snapshot=3221902118912
12/08/23 04:31:22 INFO mapred.JobClient: Map output records=31937417672

Thanks,
Gaurav Dasgupta

On Thu, Aug 23, 2012 at 4:28 PM, Gaurav Dasgupta <[EMAIL PROTECTED]> wrote:
> Hi Users,
>
> I have run a wordcount job on a Hadoop 0.20 cluster and the JobTracker Web
> UI gave me the following information after the successful completion of the
> job:
>
> *Job Counters*
> SLOTS_MILLIS_MAPS=5739
> Total time spent by all reduces waiting after reserving slots (ms)=0
> Total time spent by all maps waiting after reserving slots (ms)=0
> Launched map tasks=2
> SLOTS_MILLIS_REDUCES=0
>
> *FileSystemCounters*
> HDFS_BYTES_READ=158
> FILE_BYTES_WRITTEN=97422
> HDFS_BYTES_WRITTEN=10000
>
> *Map-Reduce Framework*
> Map input records=586006142
> Reduce shuffle bytes=53567298
> Spilled Records=108996063
> Map output bytes=468042247685
> CPU time spent (ms)=91162220
> Total committed heap usage (bytes)=981605744640
> Combine input records=32046224559
> SPLIT_RAW_BYTES=382500
> Reduce input records=96063
> Reduce input groups=1000
> Combine output records=108902950
> Physical memory (bytes) snapshot=1147705057280
> Reduce output records=1000
> Virtual memory (bytes) snapshot=3221902118912
> Map output records=31937417672
>
> Can someone explain all of the above metrics to me? I mainly want to know the
> "total shuffled bytes" of the job. Is it "Reduce shuffle bytes"? Also, how
> can I calculate the "total shuffle time taken"?
> Also, which of the above are the "Map Input Size", "Reduce Input Size" and
> "Reduce Output Size"?
> I also want to know what the difference is between "FILE_BYTES_WRITTEN" and
> "HDFS_BYTES_WRITTEN". What is it writing outside HDFS which is bigger in size