MapReduce, mail # user - Info required regarding JobTracker Job Details/Metrics


Re: Info required regarding JobTracker Job Details/Metrics
Sonal Goyal 2012-08-23, 13:20
Don't the completed job metrics in the JobTracker, or the output of
bin/hadoop job -history, provide the information you seek?

Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>
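[Editor's note] On the "total shuffle time" question quoted below: Hadoop 1.x job history output (what bin/hadoop job -history reads) records per-reduce-attempt timestamps, which to my knowledge include START_TIME, SHUFFLE_FINISHED, SORT_FINISHED and FINISH_TIME. The sketch below assumes those four values have already been extracted into dictionaries, one per reduce attempt. The extraction step, the function name phase_totals and the sample timestamps are illustrative and not from the thread.

```python
def phase_totals(reduce_attempts):
    """Sum shuffle, sort and reduce phase durations (ms) over reduce attempts.

    Each attempt is a dict with millisecond timestamps under the keys
    START_TIME, SHUFFLE_FINISHED, SORT_FINISHED and FINISH_TIME
    (assumed to have been pulled from the job history beforehand).
    """
    totals = {"shuffle": 0, "sort": 0, "reduce": 0}
    for attempt in reduce_attempts:
        totals["shuffle"] += attempt["SHUFFLE_FINISHED"] - attempt["START_TIME"]
        totals["sort"] += attempt["SORT_FINISHED"] - attempt["SHUFFLE_FINISHED"]
        totals["reduce"] += attempt["FINISH_TIME"] - attempt["SORT_FINISHED"]
    return totals


# Made-up timestamps for two reduce attempts, for illustration only.
sample = [
    {"START_TIME": 1000, "SHUFFLE_FINISHED": 4000,
     "SORT_FINISHED": 4500, "FINISH_TIME": 9000},
    {"START_TIME": 1200, "SHUFFLE_FINISHED": 5200,
     "SORT_FINISHED": 5300, "FINISH_TIME": 8300},
]

print(phase_totals(sample))
```

These are sums across attempts, not wall-clock time: reducers run in parallel, and the shuffle phase of a reducer can overlap with maps that are still running.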

On Thu, Aug 23, 2012 at 5:36 PM, Gaurav Dasgupta <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Thanks for your replies.
> Any idea how I can calculate the "total shuffle time"?
> I can get and calculate the total time taken by all the Mappers and all
> the Reducers separately, but the intermediate shuffle/sort time is absent.
> Any clue?
>
> Thanks,
> Gaurav Dasgupta
>
>
> On Thu, Aug 23, 2012 at 5:26 PM, Sonal Goyal <[EMAIL PROTECTED]> wrote:
>
>> Gaurav,
>>
>> You can also refer to Tom White's Hadoop: The Definitive Guide, Chapter 8,
>> which has a reference to each of the job counters. I believe the Apache
>> site also had a page detailing the counters, but I can't seem to locate it.
>>
>> Best Regards,
>> Sonal
>> Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
>> Nube Technologies <http://www.nubetech.co/>
>>
>> <http://in.linkedin.com/in/sonalgoyal>
>>
>> On Thu, Aug 23, 2012 at 5:20 PM, Bejoy Ks <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Gaurav
>>>
>>> If it is just a simple word count example:
>>> Map input size = HDFS_BYTES_READ
>>> Reduce output size = HDFS_BYTES_WRITTEN
>>> Reduce input size should be Map output bytes
>>>
>>> FILE_BYTES_WRITTEN is what the job writes to the local file system (LFS).
>>> AFAIK it is the map tasks' intermediate output written to LFS.
>>>
>>>
>>> Regards
>>> Bejoy KS
>>>
>>>
>>> On Thu, Aug 23, 2012 at 4:54 PM, Gaurav Dasgupta <[EMAIL PROTECTED]> wrote:
>>>
>>>> Sorry, the correct outcomes for the single wordcount job are:
>>>>
>>>> 12/08/23 04:31:22 INFO mapred.JobClient: Job complete: job_201208230144_0002
>>>> 12/08/23 04:31:22 INFO mapred.JobClient: Counters: 26
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:   Job Counters
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Launched reduce tasks=64
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=103718235
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Launched map tasks=3060
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Data-local map tasks=3060
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=9208855
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:   FileSystemCounters
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_READ=58263069209
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2046757548
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=28095
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:   Map-Reduce Framework
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Map input records=586006142
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce shuffle bytes=53567298
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Spilled Records=108996063
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Map output bytes=468042247685
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     CPU time spent (ms)=91162220
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total committed heap usage (bytes)=981605744640
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Combine input records=32046224559
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SPLIT_RAW_BYTES=382500
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input records=96063
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Combine output records=108902950
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input groups=1000
>>>
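
[Editor's note] The counter mapping suggested in the thread (map input size = HDFS_BYTES_READ, reduce output size = HDFS_BYTES_WRITTEN, reduce input size roughly Map output bytes) can be applied to JobClient output with a small parser. This is only a sketch: it assumes each counter sits on a single unwrapped line, and the names COUNTER_RE, parse_counters and the summary keys are made up for illustration. The sample lines are copied from the output quoted above.

```python
import re

# Matches lines such as:
#   12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
COUNTER_RE = re.compile(
    r"INFO mapred\.JobClient:\s+(?P<name>[^=]+)=(?P<value>\d+)\s*$"
)


def parse_counters(lines):
    """Return {counter name: integer value} for every 'name=value' log line."""
    counters = {}
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            counters[match.group("name").strip()] = int(match.group("value"))
    return counters


log = """\
12/08/23 04:31:22 INFO mapred.JobClient: Counters: 26
12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2046757548
12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=28095
12/08/23 04:31:22 INFO mapred.JobClient:     Map output bytes=468042247685
""".splitlines()

counters = parse_counters(log)

# Mapping as suggested in the thread for a simple word count job.
summary = {
    "map_input_bytes": counters["HDFS_BYTES_READ"],
    "reduce_output_bytes": counters["HDFS_BYTES_WRITTEN"],
    "map_output_bytes": counters["Map output bytes"],
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

Header lines such as "Counters: 26" contain no "=" and are skipped by the pattern.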