HBase, mail # dev - Re: Build failed in Jenkins: HBase-TRUNK #3498


Re: Build failed in Jenkins: HBase-0.94 #589
ramkrishna vasudevan 2012-11-17, 05:04
@Lars @Ted

If any of the newly added tests in TestSplitTransactionOnCluster are
continuously failing, I can take a look at them tonight.

Regards
Ram

On Sat, Nov 17, 2012 at 6:56 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> I "fixed" the top three failing tests (mostly just race conditions with
> bad timeouts). "Fixed" is in quotes, because I just looked at the point where
> the tests failed and made a better guess about how long to wait. Waiting
> some fixed amount of time is almost always bad in tests (unless it is a
> long wait as a safety guard to ensure the test will eventually end), but
> that was the fastest avenue to get them to pass.
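
As an illustration of the point above about fixed waits, here is a minimal sketch of polling for a condition under a long safety timeout instead of sleeping for a guessed amount of time. `WaitUtil` and `waitFor` are hypothetical names, not part of HBase or of the tests discussed in this thread, and the sketch assumes Java 8 for the lambda syntax.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not an HBase class): poll a condition instead of sleeping a fixed time. */
final class WaitUtil {
    private WaitUtil() {}

    /**
     * Checks the condition every intervalMs until it holds or timeoutMs elapses.
     * Returns true as soon as the condition holds, false on timeout or interrupt.
     */
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (condition.getAsBoolean()) {
                return true; // fast machines return early, no wasted sleep
            }
            long remainingNs = deadline - System.nanoTime();
            if (remainingNs <= 0) {
                return false; // safety guard: the test still ends eventually
            }
            // Never sleep past the deadline; sleep at least 1 ms while time remains.
            long sleepMs = Math.min(intervalMs,
                    Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNs)));
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }
}
```

With a helper like this, the timeout can be generous (say 60 seconds) without slowing down the normal case, because the wait ends as soon as the condition becomes true. Only a genuinely stuck test pays the full timeout.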
>
>
> Will also look at the other tests. I think we should get to 50% pass rate
> of the Jenkins build soon, and then to at least 80% pass rate.
> This might mean disabling some of the bad tests.
>
>
> -- Lars
>
>
>
> ________________________________
>  From: lars hofhansl <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Wednesday, November 14, 2012 12:40 PM
> Subject: Re: Build failed in Jenkins: HBase-0.94 #589
>
> Agreed. Looking at TestSplitLogManager.testUnassignedTimeout.
> It has some pretty tight timeouts (500ms), which is likely to be a problem
> on slow (or overloaded) build machines.
> I'm doubling the timeouts.
>
>
> Anyway, the current run just got past these two tests, let's hope there're
> no other failures.
> Then we can tackle these tests in 0.94.4 and 0.96.
>
> -- Lars
>
>
> ----- Original Message -----
> From: Ted Yu <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Cc:
> Sent: Wednesday, November 14, 2012 12:30 PM
> Subject: Re: Build failed in Jenkins: HBase-0.94 #589
>
> I noticed the test failures in TestSplitTransactionOnCluster
>
> 0.94.3 has a fix for the region splitting issue. I think we should pay a
> little attention to fixing TestSplitTransactionOnCluster so that it passes
> more often.
>
> Cheers
>
> On Wed, Nov 14, 2012 at 12:18 PM, lars hofhansl <[EMAIL PROTECTED]>
> wrote:
>
> > Here are the tests that failed recently without a fix:
> >
> >
> > TestSplitLogManager.testUnassignedTimeout x 3
> > TestSplitLogManager.testMultipleResubmits
> > TestSplitTransactionOnCluster.testShutdownFixupWhenDaughterHasSplit x 2
> > TestSplitTransactionOnCluster.testMasterRestartWhenSplittingIsPartial
> >
> > TestSplitTransactionOnCluster.testShouldThrowIOExceptionIfStoreFileSizeIsEmptyAndSHouldSuccessfullyExecuteRollback
> > TestCatalogTrackerOnCluster.testBadOriginalRootLocation
> > TestDistributedLogSplitting.testDelayedDeleteOnFailure
> > TestScannerTimeout.test3686a
> > TestReplication.testVerifyRepJob
> > TestReplication.queueFailover
> > TestFromClientSideWithCoprocessor.testPoolBehavior
> > TestColumnSeeking.testDuplicateVersions
> >
> >
> >
> > Based on that, at least TestSplitLogManager.testUnassignedTimeout should
> > get the axe (or be investigated).
> >
> > -- Lars
> >
> > ----- Original Message -----
> > From: Jimmy Xiang <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> > Cc:
> > Sent: Wednesday, November 14, 2012 12:12 PM
> > Subject: Re: Build failed in Jenkins: HBase-0.94 #589
> >
> > I agree. +1
> >
> > We can keep a list of flaky tests so that we can fix them later on.
> >
> > Thanks,
> > Jimmy
> >
> > On Wed, Nov 14, 2012 at 11:55 AM, lars hofhansl <[EMAIL PROTECTED]>
> > wrote:
> > > Sigh.
> > >
> > > It seems we're back at having a successful build being the exception
> > > rather than the rule.
> > > In this case it was some(?) timeout (all tests ran and passed), but many
> > > previous runs had at least one test failing.
> > >
> > > Flaky tests are useless. They do not add confidence to a run, and worse,
> > > they add noise, which requires us to manually filter the good from the bad
> > > runs by looking at the results.
> > >
> > > There was talk about separating the flaky tests from the good ones.
> > > Short term I propose to disable or remove every test that failed more