Adarsh Sharma 2010-12-08, 12:17
Dear all,
Has anyone encountered the error below while running a job in Hadoop? It occurs in the reduce phase of the job.
attempt_201012061426_0001_m_000292_0: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201012061426_0001/attempt_201012061426_0001_m_000292_0/output/file.out
It says that it cannot locate a file that is created in Hadoop's mapred.local.dir. Thanks in advance for any information regarding this.
Best Regards
Adarsh Sharma
Any chance mapred.local.dir is under /tmp and part of it got cleaned up?
Adarsh Sharma 2010-12-09, 03:48
Ted Yu wrote:
> Any chance mapred.local.dir is under /tmp and part of it got cleaned up?

Hi Ted,
My mapred.local.dir is in the /home/hadoop directory. I also tried it in the /hdd2-2 directory, where we have lots of space.
Would mapred.map.tasks affect this?
I tried the default and also 80 maps and 16 reduces, as I have 8 slaves.

<property>
  <name>mapred.local.dir</name>
  <value>/home/hadoop/mapred/local</value>
  <description>The local directory where MapReduce stores intermediate
  data files. May be a comma-separated list of directories on different
  devices in order to spread disk i/o.
  Directories that do not exist are ignored.
  </description>
</property>

<property>
  <name>mapred.system.dir</name>
  <value>/home/hadoop/mapred/system</value>
  <description>The shared directory where MapReduce stores control files.
  </description>
</property>

Let me know if you need any further information.

Thanks & Regards
Adarsh Sharma
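
The mapred.local.dir description quoted above says the value may be a comma-separated list and that directories that do not exist are ignored. A minimal Python sketch of that rule follows; it is an illustration only, not Hadoop's own code, and the function name `usable_local_dirs` is hypothetical. If the returned list is empty, there is no local directory left to write to, which is one plausible way to end up with "Could not find any valid local directory".

```python
import os


def usable_local_dirs(value):
    """Split a comma-separated mapred.local.dir value and keep only the
    entries that exist as directories on this node (hypothetical helper)."""
    candidates = [d.strip() for d in value.split(",") if d.strip()]
    return [d for d in candidates if os.path.isdir(d)]


if __name__ == "__main__":
    # The path is the one from the configuration in this thread.
    print(usable_local_dirs("/home/hadoop/mapred/local"))
```

Running this on each slave with the configured value shows which entries, if any, survive on that node.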
Go through the jobtracker, find the node that handled attempt_201012061426_0001_m_000292_0, and figure out whether there are FS or permission problems.

Raj

________________________________
From: Adarsh Sharma <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Wed, December 8, 2010 7:48:47 PM
Subject: Re: Reduce Error
Adarsh Sharma 2010-12-09, 04:21
Raj V wrote:
> Go through the jobtracker, find the relevant node that handled
> attempt_201012061426_0001_m_000292_0 and figure out
> if there are FS or permission problems.

Sir, I read the tasktracker logs several times but was not able to find any reason, as they are not very useful. I have attached the tasktracker log to this mail; the main portion is listed below.

2010-12-06 15:27:04,228 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201012061426_0001_m_000000_1' to tip task_201012061426_0001_m_000000, for tracker 'tracker_ws37-user-lin:127.0.0.1/127.0.0.1:60583'
2010-12-06 15:27:04,228 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_201012061426_0001_m_000000
2010-12-06 15:27:04,229 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201012061426_0001_m_000000_0' from 'tracker_ws37-user-lin:127.0.0.1/127.0.0.1:60583'
2010-12-06 15:27:07,235 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201012061426_0001_m_000328_0: java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:30)
    at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:19)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201012061426_0001/attempt_201012061426_0001_m_000328_0/output/spill16.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)

2010-12-06 15:27:07,236 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201012061426_0001_m_000000_1: Error initializing attempt_201012061426_0001_m_000000_1:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201012061426_0001/job.xml
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:750)
    at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1664)
    at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
    at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1629)
Thanks & Regards
Adarsh Sharma
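
When a log like the one above is long, it can help to pull out only the paths that the allocator failed to place. The following is a rough Python sketch that assumes the message text is exactly as it appears in this thread; the names `failed_local_paths` and `attempt_of` are hypothetical helpers, not part of Hadoop.

```python
import re

# Captures the relative path that follows the DiskErrorException message.
_DISK_ERROR = re.compile(
    r"DiskErrorException: Could not find any valid local directory for (\S+)"
)
# Task attempt ids as they appear in this thread, e.g. attempt_..._m_000328_0.
_ATTEMPT = re.compile(r"attempt_\d+_\d+_[mr]_\d+_\d+")


def failed_local_paths(log_text):
    """Return, in order of appearance, every path named in a DiskErrorException."""
    return _DISK_ERROR.findall(log_text)


def attempt_of(path):
    """Return the attempt id embedded in a path, or None if there is none."""
    match = _ATTEMPT.search(path)
    return match.group(0) if match else None
```

In the excerpt above, one failing path belongs to a task attempt (a spill file) and the other is the job-level job.xml, which suggests the problem is with the local directories on the node as a whole rather than with a single task.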
From Raj earlier:

I have seen this error from time to time, and it has been due to either lack of space, missing directories, or disk errors.

The space issue was caused by the fact that I had mounted /dev/sdc on /hadoop-dsk and the mount had failed. In another case I had accidentally deleted hadoop.tmp.dir on a node, and whenever the reduce job was scheduled on that node that attempt would fail.
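
The causes Raj lists (no space, a missing directory, a mount that failed) can be checked directly on the affected node. Below is a minimal Python sketch of such a check, using only standard-library calls; the function name `diagnose_local_dir`, its parameters, and the message strings are assumptions made for illustration and are not anything Hadoop provides.

```python
import os
import shutil


def diagnose_local_dir(path, min_free_bytes=0, expected_mount=None):
    """Return a list of human-readable problems found for one local directory."""
    problems = []
    # A failed mount leaves the mount point as a plain directory on the parent disk.
    if expected_mount is not None and not os.path.ismount(expected_mount):
        problems.append("not mounted: " + expected_mount)
    if not os.path.isdir(path):
        problems.append("missing directory: " + path)
        return problems
    # Write and search permission are both needed to create files inside it.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append("not writable: " + path)
    if shutil.disk_usage(path).free < min_free_bytes:
        problems.append("low free space: " + path)
    return problems


if __name__ == "__main__":
    # Path taken from this thread; the 1 GiB threshold is an arbitrary example.
    print(diagnose_local_dir("/home/hadoop/mapred/local", min_free_bytes=1 << 30))
```

The check should be run as the user that runs the tasktracker, since `os.access` reports permissions for the calling user.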
Adarsh Sharma 2010-12-09, 08:55
Ted Yu wrote:
> From Raj earlier:
>
> I have seen this error from time to time and it has been either due to space
> or missing directories or disk errors.

Thanks to all for your replies. I fixed this issue by setting the property below to /hdd-1/tmp.
This error occurs because of insufficient space in the mapred.child.tmp directory.

<property>
  <name>mapred.child.tmp</name>
  <value>./tmp</value>
  <description> To set the value of tmp directory for map and reduce tasks.
  If the value is an absolute path, it is directly assigned. Otherwise, it is
  prepended with task's working directory. The java tasks are executed with
  option -Djava.io.tmpdir='the absolute path of the tmp dir'. Pipes and
  streaming are set with environment variable,
  TMPDIR='the absolute path of the tmp dir'
  </description>
</property>

With Best Regards
Adarsh Sharma
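
The mapred.child.tmp description above states a simple resolution rule: an absolute value is used as is, and a relative one is placed under the task's working directory. A small Python sketch of that rule follows, as an illustration only. The function names and the `/x/work` directory in the example are hypothetical, and the path normalization is an addition for readability rather than something the description promises.

```python
import posixpath


def resolve_child_tmp(value, task_work_dir):
    """Apply the rule from the property description: absolute paths are used
    directly, relative ones are prefixed with the task's working directory."""
    if posixpath.isabs(value):
        return value
    return posixpath.normpath(posixpath.join(task_work_dir, value))


def jvm_tmp_option(abs_tmp_dir):
    """Build the JVM option that the description says Java tasks receive."""
    return "-Djava.io.tmpdir=" + abs_tmp_dir


if __name__ == "__main__":
    # Default value: ends up inside the task's working directory.
    print(resolve_child_tmp("./tmp", "/x/work"))
    # Value used in this thread: an absolute path on a larger disk.
    print(jvm_tmp_option(resolve_child_tmp("/hdd-1/tmp", "/x/work")))
```

With the default ./tmp, the temporary directory sits on whichever disk holds the task's working directory, so pointing it at an absolute path on a disk with more room, as done here, moves it off that disk.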