MapReduce, mail # dev - division by zero in getLocalPathForWrite()


Re: division by zero in getLocalPathForWrite()
Ted Yu 2012-10-30, 04:56
TestRowCounter still fails:
https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/244/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterNoColumn/

but there was no 'divide by zero' exception.

Cheers

On Thu, Oct 25, 2012 at 8:04 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> I will try 2.0.2-alpha release.
>
> Cheers
>
>
> On Thu, Oct 25, 2012 at 7:54 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
>> Thanks for the quick response, Robert.
>> Here is the hadoop version being used:
>>     <hadoop-two.version>2.0.1-alpha</hadoop-two.version>
>>
>> If there is a newer release, I am willing to try it before filing a JIRA.
>>
>>
>> On Thu, Oct 25, 2012 at 7:07 AM, Robert Evans <[EMAIL PROTECTED]> wrote:
>>
>>> It looks like you are running an older version of 2.0, although that
>>> does not really make much of a difference in this case. The issue shows
>>> up when getLocalPathForWrite thinks there is no space to write to on
>>> any of the disks it has configured. This could be because you do not
>>> have any directories configured. I really don't know for sure exactly
>>> what is happening. It might be disk fail-in-place removing disks for you
>>> because of other issues. Either way, we should file a JIRA against
>>> Hadoop so that we never get the / by zero error and so that the possible
>>> causes are handled better.
>>>
>>> --Bobby Evans
>>>
>>> On 10/24/12 11:54 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
>>>
>>> >Hi,
>>> >HBase has a Jenkins build against Hadoop 2.0.
>>> >I was checking why TestRowCounter sometimes failed:
>>> >
>>> >https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/231/testReport/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterExclusiveColumn/
>>> >
>>> >I think the following could be the cause:
>>> >
>>> >2012-10-22 23:46:32,571 WARN  [AsyncDispatcher event handler]
>>> >resourcemanager.RMAuditLogger(255): USER=jenkins OPERATION=Application
>>> >Finished - Failed      TARGET=RMAppManager     RESULT=FAILURE
>>> >DESCRIPTION=App failed with state: FAILED      PERMISSIONS=Application
>>> >application_1350949562159_0002 failed 1 times due to AM Container for
>>> >appattempt_1350949562159_0002_000001 exited with  exitCode: -1000 due
>>> >to: java.lang.ArithmeticException: / by zero
>>> >       at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:355)
>>> >       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
>>> >       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
>>> >       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
>>> >       at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:257)
>>> >       at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:849)
>>> >
>>> >However, I don't seem to find where in getLocalPathForWrite() division
>>> >by zero could have arisen.
>>> >
>>> >Comments / hints are welcome.
>>> >
>>> >Thanks
>>>
>>>
>>
>
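
Robert's explanation above (no usable space on any configured directory, or no directories at all) can be illustrated with a small sketch. It assumes the allocator picks a directory by taking a random position modulo the total free space across directories; that is an assumption made for illustration, not a copy of Hadoop's LocalDirAllocator code, and the names RouletteSketch, pickUnguarded and pickGuarded are hypothetical.

```java
import java.util.Random;

// Hypothetical illustration only; not Hadoop's LocalDirAllocator.
class RouletteSketch {

    // Picks a directory index with probability proportional to its free space.
    // If every directory reports 0 bytes free (or the array is empty), the
    // modulo below throws java.lang.ArithmeticException: / by zero.
    static int pickUnguarded(long[] availableBytes, Random r) {
        long total = 0;
        for (long a : availableBytes) {
            total += a;
        }
        long pos = (r.nextLong() >>> 1) % total; // non-negative, in [0, total)
        int dir = 0;
        while (pos >= availableBytes[dir]) {
            pos -= availableBytes[dir];
            dir++;
        }
        return dir;
    }

    // Same selection, but returns -1 instead of dividing by zero when there
    // is nothing to choose from, so the caller can report a clearer error.
    static int pickGuarded(long[] availableBytes, Random r) {
        long total = 0;
        for (long a : availableBytes) {
            total += a;
        }
        if (total == 0) {
            return -1;
        }
        return pickUnguarded(availableBytes, r);
    }

    public static void main(String[] args) {
        Random r = new Random();
        System.out.println(pickGuarded(new long[] {0, 100, 0}, r)); // only index 1 has space
        System.out.println(pickGuarded(new long[] {0, 0}, r));      // no space anywhere
        try {
            pickUnguarded(new long[0], r);                          // no directories configured
        } catch (ArithmeticException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
    }
}
```

Under that assumption, both an empty directory list and a list whose directories all report zero free bytes lead to the same exception, which matches the two possible causes mentioned in the thread.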