MapReduce >> mail # dev >> division by zero in getLocalPathForWrite()


Re: division by zero in getLocalPathForWrite()
It certainly looks possible. Can you file a JIRA issue on the problem?

On 13 January 2013 16:39, Ted Yu <[EMAIL PROTECTED]> wrote:

> I found this error again, see
>
> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/345/testReport/org.apache.hadoop.hbase.mapreduce/TestImportExport/testSimpleCase/
>
> 2013-01-12 11:53:52,809 WARN  [AsyncDispatcher event handler] resourcemanager.RMAuditLogger(255): USER=jenkins  OPERATION=Application Finished - Failed  TARGET=RMAppManager  RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED  PERMISSIONS=Application application_1357991604658_0002 failed 1 times due to AM Container for appattempt_1357991604658_0002_000001 exited with  exitCode: -1000 due to: java.lang.ArithmeticException: / by zero
>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:368)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
>         at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:279)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:851)
>
> .Failing this attempt.. Failing the application.    APPID=application_1357991604658_0002
>
> Here is related code:
>
>         // Keep rolling the wheel till we get a valid path
>         Random r = new java.util.Random();
>         while (numDirsSearched < numDirs && returnPath == null) {
>           long randomPosition = Math.abs(r.nextLong()) % totalAvailable;
>
> My guess is that totalAvailable was 0, meaning dirDF was empty.
>
> Please advise whether that scenario is possible.
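
To illustrate the scenario raised above: if the per-directory available space sums to zero (for example, because the list of usable directories is empty), the modulo in the quoted loop throws ArithmeticException. The following is a minimal sketch of a guarded, space-weighted pick. It is not Hadoop's actual code or its eventual fix; the class WeightedDirPicker, the method pickIndex, and the choice of IllegalStateException are hypothetical names chosen for illustration, and the sketch assumes every entry in availableBytes is non-negative.

```java
import java.util.Random;

// Hypothetical helper for illustration only; not part of Hadoop.
final class WeightedDirPicker {

    /**
     * Picks a directory index at random, weighted by available bytes.
     * Assumes every entry in availableBytes is non-negative.
     * Fails with a descriptive exception instead of "/ by zero"
     * when no directory has any space available.
     */
    static int pickIndex(long[] availableBytes, Random r) {
        long totalAvailable = 0;
        for (long a : availableBytes) {
            totalAvailable += a;
        }
        // Guard: an empty list or all-zero entries would make the modulo below throw.
        if (totalAvailable <= 0) {
            throw new IllegalStateException("no local directory has space available");
        }
        // floorMod keeps the result in [0, totalAvailable) for any long input.
        long randomPosition = Math.floorMod(r.nextLong(), totalAvailable);
        // Walk the directories until the position falls inside one of them.
        int dir = 0;
        while (randomPosition >= availableBytes[dir]) {
            randomPosition -= availableBytes[dir];
            dir++;
        }
        return dir;
    }

    public static void main(String[] args) {
        Random r = new Random();
        // The directory with 0 bytes available (index 1) is never chosen.
        System.out.println(pickIndex(new long[] {100, 0, 300}, r));
        try {
            pickIndex(new long[0], r);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch uses Math.floorMod rather than Math.abs(...) % total because Math.abs(Long.MIN_VALUE) is still negative, which would yield a negative position.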
>
> Cheers
>
> On Tue, Oct 30, 2012 at 9:33 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > Thanks for the investigation Kihwal.
> >
> > I will keep an eye on future test failures in TestRowCounter.
> >
> >
> > On Tue, Oct 30, 2012 at 9:29 AM, Kihwal Lee <[EMAIL PROTECTED]> wrote:
> >
> >> Ted,
> >>
> >> I couldn't reproduce it by just running the test case. When you reproduce
> >> it, look at the stderr/stdout file somewhere under
> >> target/org.apache.hadoop.mapred.MiniMRCluster. Look for the one under the
> >> directory whose name contains the app id.
> >>
> >> I did run into a similar problem and the stderr said:
> >> /bin/bash: /bin/java: No such file or directory
> >>
> >> It was because JAVA_HOME was not set. But in this case the exit code was
> >> 127 (shell not being able to locate the command to exec). In the hudson
> >> job, the exit code was 1, so I think it's something else.
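
The exit-code distinction above can be checked locally. The following is a minimal sketch, assuming /bin/bash is installed; the class MissingCommandExitCode, the method runInBash, and the path /no/such/dir/java are hypothetical and chosen only so that the command is guaranteed not to exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
final class MissingCommandExitCode {

    /** Runs a command line through /bin/bash and returns the shell's exit code. */
    static int runInBash(String commandLine) {
        try {
            Process p = new ProcessBuilder("/bin/bash", "-c", commandLine)
                    .inheritIO() // let the shell's stderr message show on the console
                    .start();
            return p.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A command the shell cannot locate: bash reports exit code 127.
        System.out.println(runInBash("/no/such/dir/java -version"));
        // A command that runs and fails on its own reports its own code, here 1.
        System.out.println(runInBash("exit 1"));
    }
}
```

An exit code of 127 therefore points at the launch command itself, while a code of 1 means the launched process started and then failed.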
> >>
> >> Kihwal
> >>
> >> On 10/29/12 11:56 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
> >>
> >> >TestRowCounter still fails:
> >> >
> >> >https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/244/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterNoColumn/
> >> >
> >> >but there was no 'divide by zero' exception.
> >> >
> >> >Cheers
> >> >
> >> >On Thu, Oct 25, 2012 at 8:04 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> I will try the 2.0.2-alpha release.
> >> >>
> >> >> Cheers
> >> >>
> >> >>
> >> >> On Thu, Oct 25, 2012 at 7:54 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >>> Thanks for the quick response, Robert.
> >> >>> Here is the hadoop version being used:
> >> >>>     <hadoop-two.version>2.0.1-alpha</hadoop-two.version>
> >> >>>
> >> >>> If there is a newer release, I am willing to try it before filing a
> >> >>> JIRA.
> >> >>>
> >> >>>
> >> >>> On Thu, Oct 25, 2012 at 7:07 AM, Robert Evans <[EMAIL PROTECTED]> wrote:
> >> >>>
> >> >>>> It looks like you are running with an older version of 2.0, even