Map job hangs indefinitely
Sudharsan Sampath 2011-06-22, 06:21
Hi,
I am starting a job from within the map task of another job. The following is a quick mock-up of the code I use. The second job hangs indefinitely after the first task attempt fails; a second attempt is never made. This runs fine on a single-node cluster but fails on a two-node cluster.
Can someone help me understand why the failed attempt is not rescheduled, leaving the job hung?
Thanks,
Sudhan S
RE: Map job hangs indefinitely
Devaraj K 2011-06-22, 06:43
With this information it is difficult to tell where the problem is coming from. Can you check the JobTracker and TaskTracker logs related to these jobs?
Devaraj K
Re: Map job hangs indefinitely
Sudharsan Sampath 2011-06-22, 09:20
Hi Devaraj,
I have attached the files so that anyone can run them and reproduce the issue. No other files are required.
The following are the logs from the JobTracker and the TaskTracker:
*JobTracker*
2011-06-23 12:46:48,781 DEBUG org.apache.hadoop.mapred.JobTracker: Per-Task memory configuration is not set on JT. Not checking the job for invalid memory requirements.
2011-06-23 12:46:48,783 INFO org.apache.hadoop.mapred.JobTracker: Initializing job_201106231235_0001
2011-06-23 12:46:48,783 INFO org.apache.hadoop.mapred.JobInProgress: Initializing job_201106231235_0001
2011-06-23 12:46:48,872 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_201106231235_0001 = 0. Number of splits = 1
2011-06-23 12:46:49,132 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER2 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 183
2011-06-23 12:46:49,157 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201106231235_0001_m_000002_0' to tip task_201106231235_0001_m_000002, for tracker 'TASK_TRACKER2'
2011-06-23 12:46:49,158 DEBUG org.apache.hadoop.mapred.JobTracker: TASK_TRACKER2 -> LaunchTask: attempt_201106231235_0001_m_000002_0
2011-06-23 12:46:50,943 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER1 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 159
2011-06-23 12:46:52,203 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER2 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 184
2011-06-23 12:46:52,204 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201106231235_0001_m_000002_0' has completed task_201106231235_0001_m_000002 successfully.
2011-06-23 12:46:52,208 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201106231235_0001_m_000000_0' to tip task_201106231235_0001_m_000000, for tracker 'TASK_TRACKER2'
2011-06-23 12:46:52,210 DEBUG org.apache.hadoop.mapred.JobTracker: TASK_TRACKER2 -> LaunchTask: attempt_201106231235_0001_m_000000_0
2011-06-23 12:46:52,211 DEBUG org.apache.hadoop.mapred.JobTracker: TASK_TRACKER2 -> KillTaskAction: attempt_201106231235_0001_m_000002_0
12:46:53,308 DEBUG org.apache.hadoop.mapred.JobTracker: Per-Task memory configuration is not set on JT. Not checking the job for invalid memory requirements.
2011-06-23 12:46:53,309 INFO org.apache.hadoop.mapred.JobTracker: Initializing job_201106231235_0002
2011-06-23 12:46:53,309 INFO org.apache.hadoop.mapred.JobInProgress: Initializing job_201106231235_0002
2011-06-23 12:46:53,380 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_201106231235_0002 = 0. Number of splits = 1
2011-06-23 12:46:53,946 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER1 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 160
2011-06-23 12:46:53,947 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201106231235_0002_m_000002_0' to tip task_201106231235_0002_m_000002, for tracker 'TASK_TRACKER1'
2011-06-23 12:46:53,947 DEBUG org.apache.hadoop.mapred.JobTracker: TASK_TRACKER2 -> LaunchTask: attempt_201106231235_0002_m_000002_0
2011-06-23 12:46:55,215 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER2 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 185
2011-06-23 12:46:56,989 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER1 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 161
2011-06-23 12:46:57,042 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201106231235_0002_m_000002_0' has completed task_201106231235_0002_m_000002 successfully.
2011-06-23 12:46:57,044 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201106231235_0002_m_000000_0' to tip task_201106231235_0002_m_000000, for tracker 'TASK_TRACKER1'
2011-06-23 12:46:57,044 DEBUG org.apache.hadoop.mapred.JobTracker: TASK_TRACKER1 -> LaunchTask: attempt_201106231235_0002_m_000000_0
2011-06-23 12:46:57,044 DEBUG org.apache.hadoop.mapred.JobTracker: TASK_TRACKER1 -> KillTaskAction: attempt_201106231235_0002_m_000002_0
2011-06-23 12:46:58,219 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER2 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 186
2011-06-23 12:47:00,049 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER1 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 162
2011-06-23 12:47:00,049 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201106231235_0002_m_000000_0: java.lang.RuntimeException: Throwing own exception
    at com.test.MyMapper.map(MyMapper.java:26)
    at com.test.MyMapper.map(MyMapper.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
2011-06-23 12:47:01,222 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER2 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 187
2011-06-23 12:47:03,052 DEBUG org.apache.hadoop.mapred.JobTracker: Got heartbeat from: TASK_TRACKER1 (restarted: false initialContact: false acceptNewTasks: true) with responseId: 163
2011-06-23 12:47:03,053 DEBUG org.apache.hadoop.mapred.JobTracker: Marked 'attempt_201106231235_0002_m_000000_0' from 'TASK_TRACKER1'
2011-06-23 12:47:03,054 DEBUG org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201106231235_0002_m_000000_0'
2011-06-23 12:47:03,054 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201106231235_0002_m_000000_0' from 'TASK_TRACKER1'
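To check whether a retry was ever scheduled, it can help to group the attempt IDs in a JobTracker log by task. The following is only a rough sketch and not part of the attached files: the class name AttemptScanner and its method are hypothetical, and it assumes attempt IDs of the form attempt_<tracker start>_<job>_<m|r>_<task>_<attempt number>, as seen in the log above. A retried task would show a second ID ending in _1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop class): groups the attempt IDs found in
// JobTracker log lines by the task they belong to.
class AttemptScanner {
    // Matches IDs such as attempt_201106231235_0002_m_000000_0.
    // Group 1 is the task part; the trailing number is the attempt counter.
    private static final Pattern ATTEMPT =
            Pattern.compile("attempt_(\\d+_\\d+_[mr]_\\d+)_\\d+");

    static Map<String, List<String>> attemptsByTask(List<String> logLines) {
        Map<String, List<String>> byTask = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = ATTEMPT.matcher(line);
            while (m.find()) {
                String task = "task_" + m.group(1);
                List<String> attempts =
                        byTask.computeIfAbsent(task, k -> new ArrayList<>());
                // Record each distinct attempt ID once, in order of appearance.
                if (!attempts.contains(m.group())) {
                    attempts.add(m.group());
                }
            }
        }
        return byTask;
    }

    public static void main(String[] args) {
        // Three lines copied from the JobTracker log above (stack trace omitted).
        List<String> lines = Arrays.asList(
            "2011-06-23 12:46:57,044 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201106231235_0002_m_000000_0' to tip task_201106231235_0002_m_000000, for tracker 'TASK_TRACKER1'",
            "2011-06-23 12:47:00,049 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201106231235_0002_m_000000_0: java.lang.RuntimeException: Throwing own exception",
            "2011-06-23 12:47:03,054 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201106231235_0002_m_000000_0' from 'TASK_TRACKER1'");
        for (Map.Entry<String, List<String>> e : attemptsByTask(lines).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

For the lines above, task_201106231235_0002_m_000000 maps to a single attempt ending in _0, which matches the observation that no second attempt appears in this excerpt of the log.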
Thanks,
Sudhan S