MapReduce >> mail # user >> Info required regarding JobTracker Job Details/Metrics


Info required regarding JobTracker Job Details/Metrics
Hi Users,

I ran a wordcount job on a Hadoop 0.20 cluster, and the JobTracker Web
UI gave me the following information after the job completed
successfully:

*Job Counters*
SLOTS_MILLIS_MAPS=5739
Total time spent by all reduces waiting after reserving slots (ms)=0
Total time spent by all maps waiting after reserving slots (ms)=0
Launched map tasks=2
SLOTS_MILLIS_REDUCES=0
*FileSystemCounters*
HDFS_BYTES_READ=158
FILE_BYTES_WRITTEN=97422
HDFS_BYTES_WRITTEN=10000
*Map-Reduce Framework*
Map input records=586006142
Reduce shuffle bytes=53567298
Spilled Records=108996063
Map output bytes=468042247685
CPU time spent (ms)=91162220
Total committed heap usage (bytes)=981605744640
Combine input records=32046224559
SPLIT_RAW_BYTES=382500
Reduce input records=96063
Reduce input groups=1000
Combine output records=108902950
Physical memory (bytes) snapshot=1147705057280
Reduce output records=1000
Virtual memory (bytes) snapshot=3221902118912
Map output records=31937417672

Can someone explain all of the above metrics to me? I mainly want to know the
"total shuffled bytes" of the job. Is it "Reduce shuffle bytes"? Also, how
can I calculate the "total shuffle time taken"?
Also, which of the above are the "Map Input Size", "Reduce Input Size" and
"Reduce Output Size"?
I also want to know the difference between "FILE_BYTES_WRITTEN" and
"HDFS_BYTES_WRITTEN". What is being written outside HDFS that is larger
than what is written to HDFS?
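
To compare these counters side by side, it may help to load the dump into a dictionary first. Below is a minimal Python sketch, assuming the counters are pasted as plain text in the `name=value` form shown above, with group names wrapped in asterisks. The names `parse_counters` and `SAMPLE` are hypothetical and not part of Hadoop, and the sketch only reads the values; it makes no claim about what each counter means.

```python
def parse_counters(text):
    """Parse a pasted counter dump into {group: {counter_name: int}}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and stray formatting markers.
        if not line or line == "**":
            continue
        # Group headers look like *Job Counters*.
        if line.startswith("*") and line.endswith("*"):
            current = line.strip("*").strip()
            groups.setdefault(current, {})
            continue
        # Counter lines look like name=value; split on the last '='.
        name, sep, value = line.rpartition("=")
        if sep and current is not None:
            groups[current][name.strip()] = int(value)
    return groups


# A few lines copied from the counters in this post (assumed input format).
SAMPLE = """*Job Counters*
Launched map tasks=2
*FileSystemCounters*
HDFS_BYTES_READ=158
FILE_BYTES_WRITTEN=97422
HDFS_BYTES_WRITTEN=10000
*Map-Reduce Framework*
Reduce shuffle bytes=53567298
Reduce input records=96063
Reduce output records=1000
"""

counters = parse_counters(SAMPLE)
fs = counters["FileSystemCounters"]
mr = counters["Map-Reduce Framework"]

print("Reduce shuffle bytes:", mr["Reduce shuffle bytes"])
print("FILE minus HDFS bytes written:",
      fs["FILE_BYTES_WRITTEN"] - fs["HDFS_BYTES_WRITTEN"])
```

Splitting on the last `=` keeps counter names such as "Total time spent by all maps waiting after reserving slots (ms)" intact.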

Regards,
Gaurav Dasgupta