Info required regarding JobTracker Job Details/Metrics
Gaurav Dasgupta 2012-08-23, 10:58
Hi Users,

I have run a wordcount job on a Hadoop 0.20 cluster, and the JobTracker web
UI gave me the following information after the successful completion of the
job:

*Job Counters*
SLOTS_MILLIS_MAPS=5739
Total time spent by all reduces waiting after reserving slots (ms)=0
Total time spent by all maps waiting after reserving slots (ms)=0
Launched map tasks=2
SLOTS_MILLIS_REDUCES=0
*FileSystemCounters*
HDFS_BYTES_READ=158
FILE_BYTES_WRITTEN=97422
HDFS_BYTES_WRITTEN=10000
*Map-Reduce Framework*
Map input records=586006142
Reduce shuffle bytes=53567298
Spilled Records=108996063
Map output bytes=468042247685
CPU time spent (ms)=91162220
Total committed heap usage (bytes)=981605744640
Combine input records=32046224559
SPLIT_RAW_BYTES=382500
Reduce input records=96063
Reduce input groups=1000
Combine output records=108902950
Physical memory (bytes) snapshot=1147705057280
Reduce output records=1000
Virtual memory (bytes) snapshot=3221902118912
Map output records=31937417672

Can someone explain all of these metrics to me? I mainly want to know the
"total shuffled bytes" of the job. Is it "Reduce shuffle bytes"? Also, how
can I calculate the "total shuffle time taken"?
And which of the above correspond to the "Map Input Size", "Reduce Input
Size" and "Reduce Output Size"?
I also want to know the difference between "FILE_BYTES_WRITTEN" and
"HDFS_BYTES_WRITTEN". What is the job writing outside HDFS that is bigger
than what it writes to HDFS?
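
In case it helps, this is how I am trying to read these counter values
programmatically with the old 0.20 mapred API. The group name string for
the framework counters ("org.apache.hadoop.mapred.Task$Counter") is my
guess from reading the source, so please correct me if it is wrong:

import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

public class PrintShuffleBytes {
  public static void main(String[] args) throws Exception {
    JobClient client = new JobClient(new JobConf());
    // args[0] is the job id, e.g. job_201208230001_0001
    RunningJob job = client.getJob(JobID.forName(args[0]));
    Counters counters = job.getCounters();
    // Group name is my assumption; the framework counters seem
    // to live in the Task$Counter enum in 0.20.
    long shuffleBytes = counters.findCounter(
        "org.apache.hadoop.mapred.Task$Counter",
        "REDUCE_SHUFFLE_BYTES").getCounter();
    System.out.println("Reduce shuffle bytes = " + shuffleBytes);
  }
}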

Regards,
Gaurav Dasgupta