Re: division by zero in getLocalPathForWrite()
Steve Loughran 2013-01-14, 12:34
It certainly looks possible. Can you file a JIRA issue on the problem?
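
For reference, here is a minimal sketch of the failure mode Ted describes in the quoted message: if the summed free space across the local dirs is 0, the modulo throws. This is not the LocalDirAllocator source. The class name LocalDirPickSketch, both method names, the long[] input and the -1 sentinel are all hypothetical, and the guarded variant swaps in Math.floorMod for the quoted Math.abs(...) % ... expression.

```java
import java.util.Random;

// Hypothetical sketch, not Hadoop code: reproduces the "/ by zero" and shows one possible guard.
class LocalDirPickSketch {

    // Mirrors the quoted expression; throws ArithmeticException when totalAvailable is 0.
    static long unguardedPosition(long totalAvailable, Random r) {
        return Math.abs(r.nextLong()) % totalAvailable;
    }

    // Assumed guard: sum the free space first and return -1 (hypothetical sentinel)
    // when no directory reports any available space.
    static long guardedPosition(long[] availableOnDisk, Random r) {
        long totalAvailable = 0;
        for (long available : availableOnDisk) {
            totalAvailable += available;
        }
        if (totalAvailable <= 0) {
            return -1;
        }
        // floorMod keeps the result in [0, totalAvailable), even for Long.MIN_VALUE.
        return Math.floorMod(r.nextLong(), totalAvailable);
    }

    public static void main(String[] args) {
        Random r = new Random();
        try {
            unguardedPosition(0L, r);
        } catch (ArithmeticException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
        System.out.println("guarded, no space: " + guardedPosition(new long[] {0L, 0L}, r));
        System.out.println("guarded, some space: " + guardedPosition(new long[] {10L, 20L}, r));
    }
}
```

A real fix would probably report the condition as an error saying that no local directory has space, rather than return a sentinel; the sentinel only keeps the sketch short.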

On 13 January 2013 16:39, Ted Yu <[EMAIL PROTECTED]> wrote:

> I found this error again, see
>
> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/345/testReport/org.apache.hadoop.hbase.mapreduce/TestImportExport/testSimpleCase/
>
> 2013-01-12 11:53:52,809 WARN [AsyncDispatcher event handler] resourcemanager.RMAuditLogger(255): USER=jenkins OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1357991604658_0002 failed 1 times due to AM Container for appattempt_1357991604658_0002_000001 exited with exitCode: -1000 due to: java.lang.ArithmeticException: / by zero
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:368)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
>     at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:279)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:851)
>
> .Failing this attempt.. Failing the application. APPID=application_1357991604658_0002
>
> Here is related code:
>
>     // Keep rolling the wheel till we get a valid path
>     Random r = new java.util.Random();
>     while (numDirsSearched < numDirs && returnPath == null) {
>       long randomPosition = Math.abs(r.nextLong()) % totalAvailable;
>
> My guess is that totalAvailable was 0, meaning dirDF was empty.
>
> Please advise whether that scenario is possible.
>
> Cheers
>
> On Tue, Oct 30, 2012 at 9:33 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > Thanks for the investigation Kihwal.
> >
> > I will keep an eye on future test failure in TestRowCounter.
> >
> > On Tue, Oct 30, 2012 at 9:29 AM, Kihwal Lee <[EMAIL PROTECTED]> wrote:
> >
> >> Ted,
> >>
> >> I couldn't reproduce it by just running the test case. When you reproduce
> >> it, look at the stderr/stdout file somewhere under
> >> target/org.apache.hadoop.mapred.MiniMRCluster. Look for the one under the
> >> directory whose name containing the app id.
> >>
> >> I did run into a similar problem and the stderr said:
> >> /bin/bash: /bin/java: No such file or directory
> >>
> >> It was because JAVA_HOME was not set. But in this case the exit code was
> >> 127 (shell not being able to locate the command to exec). In the hudson
> >> job, the exit code was 1, so I think it's something else.
> >>
> >> Kihwal
> >>
> >> On 10/29/12 11:56 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
> >>
> >> >TestRowCounter still fails:
> >> >
> >> >https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/244/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterNoColumn/
> >> >
> >> >but there was no 'divide by zero' exception.
> >> >
> >> >Cheers
> >> >
> >> >On Thu, Oct 25, 2012 at 8:04 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> I will try 2.0.2-alpha release.
> >> >>
> >> >> Cheers
> >> >>
> >> >> On Thu, Oct 25, 2012 at 7:54 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >>> Thanks for the quick response, Robert.
> >> >>> Here is the hadoop version being used:
> >> >>> <hadoop-two.version>2.0.1-alpha</hadoop-two.version>
> >> >>>
> >> >>> If there is newer release, I am willing to try that before filing JIRA.
> >> >>>
> >> >>> On Thu, Oct 25, 2012 at 7:07 AM, Robert Evans <[EMAIL PROTECTED]> wrote:
> >> >>>
> >> >>>> It looks like you are running with an older version of 2.0, even