MapReduce >> mail # user >> Info required regarding JobTracker Job Details/Metrics


Thread:
- Gaurav Dasgupta 2012-08-23, 10:58
- Gaurav Dasgupta 2012-08-23, 11:24
- Bejoy Ks 2012-08-23, 11:50
- Sonal Goyal 2012-08-23, 11:56
- Gaurav Dasgupta 2012-08-23, 12:06

Re: Info required regarding JobTracker Job Details/Metrics
Don't the completed job metrics in the JobTracker, or bin/hadoop job
-history, provide the information you seek?

Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>
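
[Editor's note: in Hadoop 1.x, "bin/hadoop job -history all <job-output-dir>" prints per-task-attempt timestamps, including when each reduce attempt started and when its shuffle finished. Given those timestamps, the total shuffle time Gaurav asks about can be summed per reduce task. A minimal sketch, with made-up placeholder timestamps (epoch milliseconds) standing in for real history output:]

```python
# Sketch: total shuffle time across all reduce tasks.
# In Hadoop 1.x, `bin/hadoop job -history all <job-output-dir>` lists each
# reduce attempt's start and shuffle-finished timestamps (epoch ms).
# The values below are hypothetical placeholders for illustration.
reduce_timings = [
    # (start_ms, shuffle_finished_ms)
    (1345701082000, 1345701095000),
    (1345701083000, 1345701099000),
    (1345701082500, 1345701090500),
]

total_shuffle_ms = sum(finish - start for start, finish in reduce_timings)
print(total_shuffle_ms)  # -> 37000
```

Note this counts shuffle time per task; because reduces run in parallel, the sum is slot-time spent shuffling, not wall-clock time.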

On Thu, Aug 23, 2012 at 5:36 PM, Gaurav Dasgupta <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Thanks for your replies.
> Any idea how do I calculate the "total shuffle time"?
> I can get and calculate the total time taken by all the Mappers and all
> the Reducers separately, but the intermediate shuffle/sort time is absent.
> Any clue?
>
> Thanks,
> Gaurav Dasgupta
>
>
> On Thu, Aug 23, 2012 at 5:26 PM, Sonal Goyal <[EMAIL PROTECTED]> wrote:
>
>> Gaurav,
>>
>> You can also refer to Tom White's "Hadoop: The Definitive Guide", Chapter 8,
>> which has a reference to each of the job counters. I believe the Apache
>> site also had a page detailing the counters, but I can't seem to locate it.
>>
>> Best Regards,
>> Sonal
>> Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
>> Nube Technologies <http://www.nubetech.co/>
>>
>> <http://in.linkedin.com/in/sonalgoyal>
>>
>>
>>
>>
>>
>>
>> On Thu, Aug 23, 2012 at 5:20 PM, Bejoy Ks <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Gaurav
>>>
>>> If it is just a simple word count example:
>>> Map input size = HDFS_BYTES_READ
>>> Reduce output size = HDFS_BYTES_WRITTEN
>>> Reduce input size should be Map output bytes
>>>
>>> FILE_BYTES_WRITTEN is what the job writes into the local file system.
>>> AFAIK it is the map tasks' intermediate output written to the LFS.
>>>
>>>
>>> Regards
>>> Bejoy KS
>>>
>>>
>>> On Thu, Aug 23, 2012 at 4:54 PM, Gaurav Dasgupta <[EMAIL PROTECTED]> wrote:
>>>
>>>> Sorry, the correct counters for the single wordcount job are:
>>>>
>>>> 12/08/23 04:31:22 INFO mapred.JobClient: Job complete: job_201208230144_0002
>>>> 12/08/23 04:31:22 INFO mapred.JobClient: Counters: 26
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:   Job Counters
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Launched reduce tasks=64
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=103718235
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Launched map tasks=3060
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Data-local map tasks=3060
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=9208855
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:   FileSystemCounters
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_READ=58263069209
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2046757548
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=28095
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:   Map-Reduce Framework
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Map input records=586006142
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce shuffle bytes=53567298
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Spilled Records=108996063
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Map output bytes=468042247685
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     CPU time spent (ms)=91162220
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Total committed heap usage (bytes)=981605744640
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Combine input records=32046224559
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     SPLIT_RAW_BYTES=382500
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input records=96063
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Reduce input groups=1000
>>>> 12/08/23 04:31:22 INFO mapred.JobClient:     Combine output records=108902950
>>>
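
[Editor's note: the JobClient counter dump quoted above, and the counter-to-metric mapping Bejoy describes, can be extracted programmatically from the job's console log. A minimal sketch, assuming the stock Hadoop 1.x log format "... INFO mapred.JobClient:     <counter name>=<number>"; the regex and helper name are this note's own, not part of the thread:]

```python
import re

# Match "mapred.JobClient:     <counter name>=<integer>" at end of line.
COUNTER_RE = re.compile(r"mapred\.JobClient:\s+(.+?)=(\d+)\s*$")

def parse_counters(log_lines):
    """Collect counter name -> value pairs from JobClient log output."""
    counters = {}
    for line in log_lines:
        m = COUNTER_RE.search(line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters

# Sample lines taken verbatim from the dump quoted above.
log = [
    "12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_READ=394195953674",
    "12/08/23 04:31:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=28095",
    "12/08/23 04:31:22 INFO mapred.JobClient:     Map output bytes=468042247685",
]
counters = parse_counters(log)
# Per Bejoy's mapping: map input size, reduce output size, reduce input size.
print(counters["HDFS_BYTES_READ"], counters["HDFS_BYTES_WRITTEN"],
      counters["Map output bytes"])
```

This only covers integer-valued counter lines; headings such as "Job Counters" and the "Job complete" line are skipped because they carry no "=<number>" suffix.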