Aaron Zimmerman 2012-12-26, 15:14
Re: INFO org.apache.hadoop.mapred.TaskTracker: attempt_XXXX NaN%
Vinod Kumar Vavilapalli 2012-12-26, 19:21
The NaN is very suspicious, perhaps a bug - will need more information.

But irrespective, are you sending periodic updates from your map/reduce code? The framework has a 10-minute timeout to avoid hung tasks, so the user code can report progress via the Reporter interface and avoid the task failures.

HTH,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Dec 26, 2012, at 7:14 AM, Aaron Zimmerman wrote:

> Hi,
> I'm new to hadoop, setting up a new cluster on hadoop 1.0.3 that currently
> only has 2 datanode/tasktrackers. I'll be adding more soon, but I'm worried
> about something being configured incorrectly. When I run a moderately
> expensive map reduce job (via pig), the job usually fails (though it does
> succeed 1/8 times or so).
>
> ERROR 2997: Unable to recreate exception from backed error: Task
> attempt_201212171952_0406_m_000020_3 failed to report status for 601
> seconds. Killing!
>
> Any time a job runs on the cluster, both task tracker logs output line after
> line of
> INFO org.apache.hadoop.mapred.TaskTracker:
> attempt_201212171952_0411_m_000000_0 NaN%, with different attempt
> identifiers.
>
> Interspersed with these entries are lines like,
> org.apache.hadoop.mapred.TaskTracker: attempt_201212171952_0411_r_000000_0
> 0.1851852% reduce > copy (5 of 9 at 0.00 MB/s) >
>
>
> Which makes it look to me like some of the tasks are working, but some of
> the tasks just stall out, and perhaps they eventually time out the entire
> job?
>
> So maybe my job is just too labor intensive for the cluster, but the task
> tracker log entry seems odd, like something is wrong. Why would it say
> NaN%? I know that I can extend the timeout allotment, but I'd rather not do
> that as a permanent solution. Is there any other config that I could
> update? Has anyone seen that task tracker line before? I can't find
> anything about it via Google, etc.
>
> Thanks,
>
> Aaron Zimmerman
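
For reference, the "report progress periodically" suggestion above might look roughly like the sketch below. It is only an illustration and makes several assumptions: `ProgressSink` is a hypothetical stand-in for the `progress()` method of `org.apache.hadoop.mapred.Reporter` (so the snippet compiles without Hadoop on the classpath), `processAll` and the `interval` parameter are made-up names, and the string-length sum is a placeholder for whatever expensive per-record work the real job does. In an actual map or reduce method, the call inside the loop would go to the `Reporter` the framework passes in.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressSketch {

    /** Hypothetical stand-in for Reporter.progress() in the old mapred API. */
    public interface ProgressSink {
        void progress();
    }

    /**
     * Processes every record and pings the sink once per `interval` records,
     * so a long-running loop keeps telling the framework it is still alive.
     * Returns the total length of all records (placeholder "work").
     */
    public static long processAll(List<String> records, ProgressSink sink, int interval) {
        long totalLength = 0;
        int sinceLastPing = 0;
        for (String record : records) {
            totalLength += record.length(); // placeholder for the expensive step
            sinceLastPing++;
            if (sinceLastPing >= interval) {
                sink.progress();            // in Hadoop: reporter.progress()
                sinceLastPing = 0;
            }
        }
        return totalLength;
    }

    public static void main(String[] args) {
        AtomicInteger pings = new AtomicInteger();
        long total = processAll(List.of("a", "bb", "ccc", "dddd"), pings::incrementAndGet, 2);
        System.out.println("total=" + total + " pings=" + pings.get());
    }
}
```

The point of the design is that progress is reported from inside the slow loop rather than only at the end of each call, which is what keeps a task from looking hung when a single input takes a long time.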
Aaron Zimmerman 2012-12-26, 19:47