

Parsing the JobTracker Job Logs
Hi,
how can I parse the log files for our jobs? Are there existing classes I
can use?

I need to display some information in a web interface (like the native
JobTracker does).
I am talking about this kind of file:

michaela 11:52:59
/var/log/hadoop-0.20-mapreduce/history/done/michaela.ixcloud.net_1363615430691_/2013/03/19/000000
# cat job_201303181503_0864_1363686587824_christian_wordCountJob_15
Meta VERSION="1" .
Job JOBID="job_201303181503_0864" JOBNAME="wordCountJob_15"
USER="christian" SUBMIT_TIME="1363686587824"
JOBCONF="hdfs://carolin\.ixcloud\.net:8020/user/christian/\.staging/job_201303181503_0864/job\.xml"
VIEW_JOB="*" MODIFY_JOB="*" JOB_QUEUE="default" .
Job JOBID="job_201303181503_0864" JOB_PRIORITY="NORMAL" .
Job JOBID="job_201303181503_0864" LAUNCH_TIME="1363686587923"
TOTAL_MAPS="1" TOTAL_REDUCES="1" JOB_STATUS="PREP" .
Task TASKID="task_201303181503_0864_m_000002" TASK_TYPE="SETUP"
START_TIME="1363686587923" SPLITS="" .
MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002"
TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0"
START_TIME="1363686594028"
TRACKER_NAME="tracker_anna\.ixcloud\.net:localhost/127\.0\.0\.1:34657"
HTTP_PORT="50060" .
MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002"
TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0"
TASK_STATUS="SUCCESS" FINISH_TIME="1363686595929"
HOSTNAME="/default/anna\.ixcloud\.net" STATE_STRING="setup"
COUNTERS="{(org\.apache\.hadoop\.mapreduce\.FileSystemCounter)(File System
Counters)[(FILE_BYTES_READ)(FILE: Number of bytes
read)(0)][(FILE_BYTES_WRITTEN)(FILE: Number of bytes
written)(152299)][(FILE_READ_OPS)(FILE: Number of read
operations)(0)][(FILE_LARGE_READ_OPS)(FILE: Number of large read
operations)(0)][(FILE_WRITE_OPS)(FILE: Number of write
operations)(0)][(HDFS_BYTES_READ)(HDFS: Number of bytes
read)(0)][(HDFS_BYTES_WRITTEN)(HDFS: Number of bytes
written)(0)][(HDFS_READ_OPS)(HDFS: Number of read
operations)(0)][(HDFS_LARGE_READ_OPS)(HDFS: Number of large read
operations)(0)][(HDFS_WRITE_OPS)(HDFS: Number of write
operations)(1)]}{(org\.apache\.hadoop\.mapreduce\.TaskCounter)(Map-Reduce
Framework)[(SPILLED_RECORDS)(Spilled Records)(0)][(CPU_MILLISECONDS)(CPU
time spent \\(ms\\))(80)][(PHYSICAL_MEMORY_BYTES)(Physical memory
\\(bytes\\) snapshot)(91693056)][(VIRTUAL_MEMORY_BYTES)(Virtual memory
\\(bytes\\) snapshot)(575086592)][(COMMITTED_HEAP_BYTES)(Total committed
heap usage
\\(bytes\\))(62324736)]}nullnullnullnullnullnullnullnullnullnullnullnullnull"

...
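
Ideally I am after something along these lines. This is only a sketch of
what I imagine, and I am guessing at the MR1 classes here
(org.apache.hadoop.mapred.DefaultJobHistoryParser and JobHistory.JobInfo,
with DefaultJobHistoryParser.parseJobTasks() doing the actual parsing), so
I am not sure they are the right entry point or meant to be used this way:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.mapred.DefaultJobHistoryParser;
import org.apache.hadoop.mapred.JobHistory;

public class JobHistoryDump {
  public static void main(String[] args) throws Exception {
    // e.g. a file from .../history/done/... like the one shown above
    String historyFile = args[0];

    Configuration conf = new Configuration();
    // the done/ directory sits on the JobTracker's local disk in our setup;
    // FileSystem.get(conf) instead if the history file lives in HDFS
    FileSystem fs = FileSystem.getLocal(conf);

    // JobInfo collects the Job/Task/Attempt records, keyed by JobHistory.Keys
    JobHistory.JobInfo job = new JobHistory.JobInfo("");
    DefaultJobHistoryParser.parseJobTasks(historyFile, job, fs);

    // a few of the job-level fields visible in the sample above
    System.out.println("Job ID:       " + job.get(JobHistory.Keys.JOBID));
    System.out.println("Job name:     " + job.get(JobHistory.Keys.JOBNAME));
    System.out.println("User:         " + job.get(JobHistory.Keys.USER));
    System.out.println("Submit time:  " + job.getLong(JobHistory.Keys.SUBMIT_TIME));
    System.out.println("Status:       " + job.get(JobHistory.Keys.JOB_STATUS));
    System.out.println("Maps/Reduces: " + job.getInt(JobHistory.Keys.TOTAL_MAPS)
        + "/" + job.getInt(JobHistory.Keys.TOTAL_REDUCES));

    // per-task records (SETUP/MAP/REDUCE/CLEANUP)
    for (JobHistory.Task task : job.getAllTasks().values()) {
      System.out.println(task.get(JobHistory.Keys.TASKID)
          + " [" + task.get(JobHistory.Keys.TASK_TYPE) + "] "
          + task.get(JobHistory.Keys.TASK_STATUS));
    }
  }
}

If that is the wrong entry point, or if there is a more supported API for
reading these history files, a pointer would be great.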
Best Regards,
Christian.