MapReduce >> mail # user >> Info required regarding JobTracker Job Details/Metrics
Re: Info required regarding JobTracker Job Details/Metrics
Hi,

Thanks for your replies.
Any idea how I can calculate the "total shuffle time"?
I can get and calculate the total time taken by all the Mappers and all the
Reducers separately, but the intermediate shuffle/sort time is absent. Any
clue?

Thanks,
Gaurav Dasgupta
On Thu, Aug 23, 2012 at 5:26 PM, Sonal Goyal <[EMAIL PROTECTED]> wrote:

> Gaurav,
>
> You can also refer to Tom White's Hadoop: The Definitive Guide, Chapter 8,
> which has a reference to each of the job counters. I believe the Apache
> site also had a page detailing the counters, but I can't seem to locate it.
>
> Best Regards,
> Sonal
> Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
> Nube Technologies <http://www.nubetech.co/>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
> On Thu, Aug 23, 2012 at 5:20 PM, Bejoy Ks <[EMAIL PROTECTED]> wrote:
>
>> Hi Gaurav
>>
>> If it is just a simple word count example:
>> Map input size = HDFS_BYTES_READ
>> Reduce output size = HDFS_BYTES_WRITTEN
>> Reduce input size should be Map output bytes
>>
>> FILE_BYTES_WRITTEN is what the job writes to the local file system.
>> AFAIK it is the map tasks' intermediate output written to the LFS.
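
The counter mapping described above can be applied mechanically to the JobClient output. The following is only a minimal sketch, not part of Hadoop: `parse_counters` and `summarize_sizes` are hypothetical helper names, the parser assumes each counter sits on a single unwrapped `name=value` line, and the sample values are copied from the counters posted in this thread.

```python
import re

# Matches one counter line of JobClient output, for example:
# "12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674"
COUNTER_RE = re.compile(r"mapred\.JobClient:\s+(?P<name>[^=]+?)=(?P<value>\d+)\s*$")


def parse_counters(log_text):
    """Return {counter name: int value} for every 'name=value' log line."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            counters[match.group("name").strip()] = int(match.group("value"))
    return counters


def summarize_sizes(counters):
    """Apply the mapping from the reply above (hypothetical helper).

    Missing counters are reported as None rather than guessed.
    """
    return {
        "map_input_bytes": counters.get("HDFS_BYTES_READ"),
        "reduce_output_bytes": counters.get("HDFS_BYTES_WRITTEN"),
        "reduce_input_bytes": counters.get("Map output bytes"),
        "local_fs_bytes_written": counters.get("FILE_BYTES_WRITTEN"),
    }


# Three counter lines taken from the job output posted in this thread.
SAMPLE = """\
12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=28095
12/08/23 04:31:22 INFO mapred.JobClient:     Map output bytes=468042247685
"""

if __name__ == "__main__":
    for key, value in summarize_sizes(parse_counters(SAMPLE)).items():
        print(key, value)
```

Lines without a `name=value` pair, such as the group headings, are skipped, so the whole job log can be passed in unchanged as long as wrapped lines are rejoined first.
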
>>
>>
>> Regards
>> Bejoy KS
>>
>>
>> On Thu, Aug 23, 2012 at 4:54 PM, Gaurav Dasgupta <[EMAIL PROTECTED]> wrote:
>>
>>> Sorry, the correct counters for the single wordcount job are:
>>>
>>> 12/08/23 04:31:22 INFO mapred.JobClient: Job complete: job_201208230144_0002
>>> 12/08/23 04:31:22 INFO mapred.JobClient: Counters: 26
>>> 12/08/23 04:31:22 INFO mapred.JobClient:   Job Counters
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Launched reduce tasks=64
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=103718235
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Launched map tasks=3060
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Data-local map tasks=3060
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=9208855
>>> 12/08/23 04:31:22 INFO mapred.JobClient:   FileSystemCounters
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_READ=58263069209
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2046757548
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=28095
>>> 12/08/23 04:31:22 INFO mapred.JobClient:   Map-Reduce Framework
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Map input records=586006142
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce shuffle bytes=53567298
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Spilled Records=108996063
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Map output bytes=468042247685
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     CPU time spent (ms)=91162220
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total committed heap usage (bytes)=981605744640
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Combine input records=32046224559
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SPLIT_RAW_BYTES=382500
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input records=96063
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input groups=1000
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Combine output records=108902950
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1147705057280
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce output records=1000
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=3221902118912
>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Map output records=31937417672
>>>
>>>
>>> Thanks,
>>> Gaurav Dasgupta
>>>  On Thu, Aug 23, 2012 at 4:28 PM, Gaurav Dasgupta <[EMAIL PROTECTED]