Re: Hadoop cluster hangs on big hive job
Håvard Wahl Kongsgård 2013-03-07, 08:21
Could you post the Hadoop logs?
On 6 March 2013 21:04, "Daning Wang" <[EMAIL PROTECTED]> wrote:
> We have a 5-node cluster (Hadoop 1.0.4). It hung a couple of times while
> running big jobs. Basically all the nodes are dead; from the
> tasktracker's log it looks like it went into some kind of loop forever.
>
> All the log entries look like this when the problem happened.
>
> Any idea how to debug the issue?
>
> Thanks in advance.
>
> 2013-03-05 15:13:19,526 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000012_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:19,552 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000028_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:20,858 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000036_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:21,141 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000016_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:21,486 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000019_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:21,692 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000039_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:22,448 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000032_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:22,643 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000000_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:22,840 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000024_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:24,628 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000008_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:24,723 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000039_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:25,336 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000004_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:25,539 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000043_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:25,545 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000012_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:25,569 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000028_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:25,855 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000024_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:26,876 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000036_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:27,159 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000016_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:27,505 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000019_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:28,464 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000032_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:28,553 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000043_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:28,561 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_000012_0 0.131468% reduce > copy (19706 of
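
As a first debugging step, it may help to confirm from the TaskTracker log that the reduce attempts are stuck in the copy (shuffle) phase rather than just slow. The following is a minimal sketch, assuming the progress-line format shown in the quoted log. The function name `find_stalled_attempts` and the `min_reports` threshold are illustrative only and are not part of Hadoop.

```python
import re
from collections import defaultdict

# Matches TaskTracker reduce-copy progress lines in the format quoted above.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.hadoop\.mapred\.TaskTracker: "
    r"(?P<attempt>attempt_\d+_\d+_r_\d+_\d+) "
    r"(?P<progress>[\d.]+)% reduce > copy "
    r"\((?P<done>\d+) of (?P<total>\d+) at (?P<rate>[\d.]+) MB/s\)"
)


def find_stalled_attempts(lines, min_reports=2):
    """Return {attempt_id: report_count} for attempts that look stalled.

    An attempt counts as stalled (hypothetical heuristic) when it reported
    at least `min_reports` times, its copied-map counter never changed,
    and every reported transfer rate was 0.00 MB/s.
    """
    reports_by_attempt = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            reports_by_attempt[match["attempt"]].append(
                (int(match["done"]), float(match["rate"]))
            )

    stalled = {}
    for attempt, reports in reports_by_attempt.items():
        copied_counts = {done for done, _ in reports}
        all_rates_zero = all(rate == 0.0 for _, rate in reports)
        if len(reports) >= min_reports and len(copied_counts) == 1 and all_rates_zero:
            stalled[attempt] = len(reports)
    return stalled


# Three lines taken from the quoted log, used as sample input.
SAMPLE = [
    "2013-03-05 15:13:19,526 INFO org.apache.hadoop.mapred.TaskTracker: "
    "attempt_201302270947_0010_r_000012_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >",
    "2013-03-05 15:13:24,628 INFO org.apache.hadoop.mapred.TaskTracker: "
    "attempt_201302270947_0010_r_000008_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >",
    "2013-03-05 15:13:25,545 INFO org.apache.hadoop.mapred.TaskTracker: "
    "attempt_201302270947_0010_r_000012_0 0.131468% reduce > copy (19706 of 49964 at 0.00 MB/s) >",
]

if __name__ == "__main__":
    for attempt, count in sorted(find_stalled_attempts(SAMPLE).items()):
        print(f"{attempt}: {count} reports with no copy progress")
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `find_stalled_attempts(open(path))`, where `path` is whichever TaskTracker log file you want to inspect. If every reducer is flagged on every node, the shuffle itself is likely blocked, and the TaskTracker logs on the map side plus a thread dump of a hung TaskTracker would be the next things to look at.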