MapReduce user mailing list: Info required regarding JobTracker Job Details/Metrics


Re: Info required regarding JobTracker Job Details/Metrics
Sorry, the correct counters for the single wordcount job are:

12/08/23 04:31:22 INFO mapred.JobClient: Job complete: job_201208230144_0002
12/08/23 04:31:22 INFO mapred.JobClient: Counters: 26
12/08/23 04:31:22 INFO mapred.JobClient:   Job Counters
12/08/23 04:31:22 INFO mapred.JobClient:     Launched reduce tasks=64
12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=103718235
12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/08/23 04:31:22 INFO mapred.JobClient:     Launched map tasks=3060
12/08/23 04:31:22 INFO mapred.JobClient:     Data-local map tasks=3060
12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=9208855
12/08/23 04:31:22 INFO mapred.JobClient:   FileSystemCounters
12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_READ=58263069209
12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2046757548
12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=28095
12/08/23 04:31:22 INFO mapred.JobClient:   Map-Reduce Framework
12/08/23 04:31:22 INFO mapred.JobClient:     Map input records=586006142
12/08/23 04:31:22 INFO mapred.JobClient:     Reduce shuffle bytes=53567298
12/08/23 04:31:22 INFO mapred.JobClient:     Spilled Records=108996063
12/08/23 04:31:22 INFO mapred.JobClient:     Map output bytes=468042247685
12/08/23 04:31:22 INFO mapred.JobClient:     CPU time spent (ms)=91162220
12/08/23 04:31:22 INFO mapred.JobClient:     Total committed heap usage (bytes)=981605744640
12/08/23 04:31:22 INFO mapred.JobClient:     Combine input records=32046224559
12/08/23 04:31:22 INFO mapred.JobClient:     SPLIT_RAW_BYTES=382500
12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input records=96063
12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input groups=1000
12/08/23 04:31:22 INFO mapred.JobClient:     Combine output records=108902950
12/08/23 04:31:22 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1147705057280
12/08/23 04:31:22 INFO mapred.JobClient:     Reduce output records=1000
12/08/23 04:31:22 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=3221902118912
12/08/23 04:31:22 INFO mapred.JobClient:     Map output records=31937417672
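[A counter block like the one above can be pulled into a dictionary with a small script. This is only a sketch: the regex assumes the exact `mapred.JobClient` log format shown in this thread, and it will miss counter lines that were wrapped across two lines in transit.]

```python
import re

# Match "    <counter name>=<integer>" lines emitted by mapred.JobClient.
# Counter names may contain spaces (e.g. "Reduce shuffle bytes"), so we
# match non-greedily up to the final "=<digits>".
COUNTER_RE = re.compile(r"INFO mapred\.JobClient:\s{2,}(.+?)=(\d+)\s*$")

def parse_counters(log_text):
    """Return {counter name: value} for every counter line in the log."""
    counters = {}
    for line in log_text.splitlines():
        m = COUNTER_RE.search(line)
        if m:
            counters[m.group(1).strip()] = int(m.group(2))
    return counters

# Sample lines copied from the job output above.
log = """\
12/08/23 04:31:22 INFO mapred.JobClient:     Reduce shuffle bytes=53567298
12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
"""
counters = parse_counters(log)
print(counters["Reduce shuffle bytes"])  # 53567298
```

Section headers such as "Job Counters" are skipped automatically because they carry no "=value" suffix.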
Thanks,
Gaurav Dasgupta
On Thu, Aug 23, 2012 at 4:28 PM, Gaurav Dasgupta <[EMAIL PROTECTED]> wrote:

> Hi Users,
>
> I have run a wordcount job on a Hadoop 0.20 cluster and the JobTracker Web
> UI gave me the following information after the successful completion of the
> job:
>
> *Job Counters*
> SLOTS_MILLIS_MAPS=5739
> Total time spent by all reduces waiting after reserving slots (ms)=0
> Total time spent by all maps waiting after reserving slots (ms)=0
> Launched map tasks=2
> SLOTS_MILLIS_REDUCES=0
>
> *FileSystemCounters*
> HDFS_BYTES_READ=158
> FILE_BYTES_WRITTEN=97422
> HDFS_BYTES_WRITTEN=10000
> *Map-Reduce Framework*
> Map input records=586006142
> Reduce shuffle bytes=53567298
> Spilled Records=108996063
> Map output bytes=468042247685
> CPU time spent (ms)=91162220
> Total committed heap usage (bytes)=981605744640
> Combine input records=32046224559
> SPLIT_RAW_BYTES=382500
> Reduce input records=96063
> Reduce input groups=1000
> Combine output records=108902950
> Physical memory (bytes) snapshot=1147705057280
> Reduce output records=1000
> Virtual memory (bytes) snapshot=3221902118912
> Map output records=31937417672
>
> Can someone explain all of the above metrics to me? I mainly want to know the
> "total shuffled bytes" of the jobs. Is it "Reduce shuffle bytes"? Also, how
> can I calculate the "total shuffle time taken"?
> Also, which of the above are the "Map Input Size", "Reduce Input Size" and
> "Reduce Output Size"?
> I also want to know what is the difference between "FILE_BYTES_WRITTEN" and
> "HDFS_BYTES_WRITTEN". What is it writing outside HDFS which is bigger in size
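[One relationship in the counters quoted above that can be checked with plain arithmetic is the combiner's effect: combine input records exceed map output records because spilled records are re-combined during merge passes, and the combiner collapses almost all of them before the shuffle. This is just a calculation on the numbers in the thread, not a Hadoop API call.]

```python
# Counter values copied from the job output quoted above.
map_output_records = 31937417672
combine_input_records = 32046224559   # > map output: spills are re-combined
combine_output_records = 108902950
reduce_input_records = 96063

# Fraction of records the combiner eliminated before the shuffle.
reduction = 1 - combine_output_records / combine_input_records
print(f"combiner removed {reduction:.1%} of records")  # roughly 99.7%
```

This kind of cross-check makes it easier to see why "Reduce shuffle bytes" (53,567,298) is so small relative to "Map output bytes" (468,042,247,685) in this job.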