Hadoop >> mail # general >> Lost tasktracker errors


Re: Lost tasktracker errors
Is there anything in the task tracker's logs?  Did the machines go down?
Are there full disks on those nodes?
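
For the full-disk question, a minimal sketch along these lines could be run on each node. The 90% threshold and the helper name nearly_full are assumptions for illustration, not part of Hadoop; the directories to pass in would be whatever mapred.local.dir and the log directory point to in your own configuration.

```python
import shutil


def nearly_full(paths, threshold=0.90):
    """Return (path, used_fraction) for every path whose filesystem
    is at or above the given usage threshold."""
    flagged = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        if usage.total == 0:
            continue  # skip pseudo-filesystems that report no capacity
        used_fraction = usage.used / usage.total
        if used_fraction >= threshold:
            flagged.append((path, used_fraction))
    return flagged
```

Calling nearly_full() with the node's local directories returns an empty list when every filesystem is below the threshold.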

--Bobby

On 1/4/13 5:52 AM, "Royston Sellman" <[EMAIL PROTECTED]> wrote:

>I'm running a job over a 380-billion-row, 20 TB dataset that computes
>sum(), max(), etc. The job runs fine at around 3 million rows per
>second for several hours, then grinds to a halt as it loses one
>tasktracker after another.  We see a healthy mix of successful map and
>reduce attempts on the tasktracker...
>
>2013-01-03 23:41:40,249 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041109_0 1.0%
>2013-01-03 23:41:40,256 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041105_0 1.0%
>2013-01-03 23:41:40,260 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041105_0 1.0%
>2013-01-03 23:41:40,261 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201301031813_0001_m_041105_0 is done.
>2013-01-03 23:41:40,261 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201301031813_0001_m_041105_0  was 111
>2013-01-03 23:41:40,261 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 8
>2013-01-03 23:41:40,374 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041106_0 0.9884119%
>2013-01-03 23:41:40,432 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201301031813_0001_m_2021872807 exited with exit code 0. Number of tasks it ran: 1
>2013-01-03 23:41:40,807 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041103_0 0.9884134%
>2013-01-03 23:41:43,190 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041101_0 1.0%
>2013-01-03 23:41:43,193 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041101_0 1.0%
>2013-01-03 23:41:43,194 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201301031813_0001_m_041101_0 is done.
>2013-01-03 23:41:43,194 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201301031813_0001_m_041101_0  was 111
>2013-01-03 23:41:43,194 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 9
>2013-01-03 23:41:43,303 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041109_0 1.0%
>2013-01-03 23:41:43,306 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041109_0 1.0%
>2013-01-03 23:41:43,307 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201301031813_0001_m_041109_0 is done.
>2013-01-03 23:41:43,307 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201301031813_0001_m_041109_0  was 111
>2013-01-03 23:41:43,307 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 10
>2013-01-03 23:41:43,361 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201301031813_0001_m_36690963 exited with exit code 0. Number of tasks it ran: 1
>2013-01-03 23:41:43,428 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041106_0 1.0%
>2013-01-03 23:41:43,432 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201301031813_0001_m_041106_0 1.0%
>2013-01-03 23:41:43,433 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201301031813_0001_m_041106_0 is done.
>2013-01-03 23:41:43,433 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201301031813_0001_m_041106_0  was 111
>2013-01-03 23:41:43,433 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 11
>2013-01-03 23:41:43,457 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201301031813_0001_m_-2095784622 exited with exit code 0. Number of tasks it ran: 1
>2013-01-03 23:41:43,595 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201301031813_0001_m_1190449426 exited with exit code 0. Number of tasks it ran: 1
>2013-01-03 23:41:43,862 INFO org.apache.hadoop.mapred.TaskTracker:
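
When comparing many tasktracker logs to see when each one went quiet, an excerpt like the quoted one can be summarized mechanically. The following is only a sketch: the function name summarize and the returned dictionary keys are made up for illustration, and it assumes each log entry sits on a single line in the "date time,millis LEVEL logger: message" layout seen in the excerpt (that is, the raw log file rather than the mail-wrapped copy).

```python
import re
from datetime import datetime

# date time,millis LEVEL logger: message
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) (\w+) (\S+): (.*)$"
)
SLOTS_RE = re.compile(r"addFreeSlot : current free slots : (\d+)")
DONE_RE = re.compile(r"Task (attempt_\S+) is done\.")


def summarize(lines):
    """Return the last timestamp seen, the attempts reported done,
    and the most recent free-slot count in a tasktracker log."""
    last_seen = None
    done = []
    free_slots = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank, wrapped, or truncated lines
        stamp, millis, _level, _logger, message = match.groups()
        last_seen = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(
            microsecond=int(millis) * 1000
        )
        done_match = DONE_RE.search(message)
        if done_match:
            done.append(done_match.group(1))
        slots_match = SLOTS_RE.search(message)
        if slots_match:
            free_slots = int(slots_match.group(1))
    return {"last_seen": last_seen, "done": done, "free_slots": free_slots}
```

Running summarize() over each node's log and sorting by last_seen gives a rough order in which the tasktrackers stopped writing, which can then be lined up against the jobtracker's "lost tracker" messages.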