

Re: Measuring running times
At the default log level, Hadoop job logs (the ones you also get in the
job's output directory under _logs/history) contain entries like the
following:

ReduceAttempt TASK_TYPE="REDUCE" TASKID="tip_200809020551_0008_r_000002"
TASK_ATTEMPT_ID="task_200809020551_0008_r_000002_0"
START_TIME="1220331166789"
HOSTNAME="tracker_foo.bar.com:localhost/127.0.0.1:44755"

ReduceAttempt TASK_TYPE="REDUCE" TASKID="tip_200809020551_0008_r_000002"
TASK_ATTEMPT_ID="task_200809020551_0008_r_000002_0"
TASK_STATUS="SUCCESS" SHUFFLE_FINISHED="1220332036001"
SORT_FINISHED="1220332036014" FINISH_TIME="1220332063254"
HOSTNAME="tracker_foo.bar.com:localhost/127.0.0.1:44755"

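For each reduce attempt you get the start time, shuffle finish time, sort
finish time and overall finish time, all as timestamps in milliseconds
since the epoch, so the duration of each phase is the difference between
two consecutive values. Similarly, you get start and finish times for
MapAttempt entries.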
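
As a rough illustration, the timestamps in the two ReduceAttempt entries
above could be turned into per-phase durations with a small script like the
one below. This is only a sketch, not a Hadoop tool: the function names
(reduce_attempts, phase_durations_ms) are made up for this example, and it
assumes each record starts with a record type followed by KEY="value"
pairs, as in the excerpt. The phase labels simply follow the field names in
the log.

```python
import re
from collections import defaultdict

# A record starts with a type word followed by a space and a KEY="value" pair.
RECORD_START = re.compile(r'^(\w+) (?=\w+=")', re.M)
# Individual KEY="value" pairs inside a record.
PAIR = re.compile(r'(\w+)="([^"]*)"')


def reduce_attempts(log_text):
    """Merge the fields of all ReduceAttempt records by TASK_ATTEMPT_ID."""
    attempts = defaultdict(dict)
    starts = list(RECORD_START.finditer(log_text))
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        if match.group(1) != "ReduceAttempt":
            continue
        fields = dict(PAIR.findall(log_text[match.end():end]))
        attempts[fields["TASK_ATTEMPT_ID"]].update(fields)
    return attempts


def phase_durations_ms(fields):
    """Return phase durations in milliseconds, or None if a timestamp is missing."""
    keys = ("START_TIME", "SHUFFLE_FINISHED", "SORT_FINISHED", "FINISH_TIME")
    if not all(k in fields for k in keys):
        return None  # e.g. an attempt that has not finished
    start, shuffle, sort, finish = (int(fields[k]) for k in keys)
    return {
        "shuffle": shuffle - start,
        "sort": sort - shuffle,
        "reduce": finish - sort,
        "total": finish - start,
    }


# The two entries quoted in this message.
SAMPLE = '''
ReduceAttempt TASK_TYPE="REDUCE" TASKID="tip_200809020551_0008_r_000002"
TASK_ATTEMPT_ID="task_200809020551_0008_r_000002_0"
START_TIME="1220331166789"
HOSTNAME="tracker_foo.bar.com:localhost/127.0.0.1:44755"

ReduceAttempt TASK_TYPE="REDUCE" TASKID="tip_200809020551_0008_r_000002"
TASK_ATTEMPT_ID="task_200809020551_0008_r_000002_0"
TASK_STATUS="SUCCESS" SHUFFLE_FINISHED="1220332036001"
SORT_FINISHED="1220332036014" FINISH_TIME="1220332063254"
HOSTNAME="tracker_foo.bar.com:localhost/127.0.0.1:44755"
'''

if __name__ == "__main__":
    for attempt_id, fields in reduce_attempts(SAMPLE).items():
        print(attempt_id, phase_durations_ms(fields))
```

For the attempt shown here this gives about 869 seconds up to the end of
the shuffle, 13 ms for the sort and about 27 seconds for the reduce itself.
MapAttempt entries could be handled the same way using only START_TIME and
FINISH_TIME.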

Hope this helps,

Simone

On 03/17/10 12:47, Antonio D'Ettole wrote:
> Hi everybody,
> as part of my project work at school I'm running some Hadoop jobs on a
> cluster. I'd like to measure exactly how long each phase of the process
> takes: mapping, shuffling (ideally divided into copying and sorting) and
> reducing. The tasktracker logs do not seem to supply the start/end times for
> each phase, at least not all of them, even when the log level is set to
> DEBUG.
> Do you have any ideas on how I could work this out?
> Thanks
> Antonio
>
--
Simone Leo
Distributed Computing group
Advanced Computing and Communications program
CRS4
POLARIS - Building #1
Piscina Manna
I-09010 Pula (CA) - Italy
e-mail: [EMAIL PROTECTED]
http://www.crs4.it