-
recent 0.94 failures
lars hofhansl
2013-01-24, 04:03
https://builds.apache.org/job/HBase-0.94/
Prime suspects are HBASE-7599 (Devaraj) and HBASE-7638 (Sergey).
Does anybody have any ideas?

Otherwise I'll start by reverting these changes.

-- Lars
-
Re: recent 0.94 failures
Ted Yu
2013-01-24, 04:07
Lars:
Here is what I put in HBASE-7638:

Sergey and I looked at the patch.
There is no potential for a NullPointerException similar to what the HBASE-7268 addendum fixes.
See deleteCachedLocation():
{code}
if (oldLocation != null) {
  isStaleDelete = (source != null) && !oldLocation.equals(source);
{code}
I also ran the tests that failed in recent 0.94 builds and they all passed:

mt -Dtest=TestLruBlockCache,TestMiniClusterLoadParallel
mt -Dtest=TestLruBlockCache
mt -Dtest=TestCompactionState
mt -Dtest=TestRSKilledWhenMasterInitializing

I will also loop the above tests to see if I can get a test failure.

I understand it is important to have a green 0.94 build, so whether and what to roll back is up to you.

Cheers
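To make the null-safety argument concrete, here is a minimal Java sketch of the guard quoted from deleteCachedLocation(). It is not the HBase code: `CachedLocation` and `StaleDeleteCheck` are hypothetical stand-ins, and returning `false` when nothing is cached is an assumption, since the quoted excerpt does not show that branch.

```java
import java.util.Objects;

// Hypothetical stand-in for a cached region location; NOT the HBase class.
final class CachedLocation {
    final String hostAndPort;

    CachedLocation(String hostAndPort) {
        this.hostAndPort = Objects.requireNonNull(hostAndPort);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedLocation
                && hostAndPort.equals(((CachedLocation) o).hostAndPort);
    }

    @Override
    public int hashCode() {
        return hostAndPort.hashCode();
    }
}

final class StaleDeleteCheck {
    // Mirrors the quoted guard: oldLocation is dereferenced only after the
    // null check, and source is only ever passed to equals(), which
    // tolerates null, so neither argument can trigger an NPE here.
    static boolean isStaleDelete(CachedLocation oldLocation, CachedLocation source) {
        if (oldLocation == null) {
            return false; // assumption: nothing cached means nothing is stale
        }
        return source != null && !oldLocation.equals(source);
    }
}
```

With these stand-ins, a delete is flagged as stale only when a location is cached and the reporting source is a different, non-null location.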
-
Re: recent 0.94 failures
Ted Yu
2013-01-24, 04:34
I ran the tests 4 rounds and they all passed:
~/runtest.sh 4 TestLruBlockCache,TestMiniClusterLoadParallel,TestLruBlockCache,TestCompactionState,TestRSKilledWhenMasterInitializing

FYI
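The contents of runtest.sh are not shown in this thread. As a rough illustration of the idea of looping a check for several rounds to catch an intermittent failure, here is a small Java sketch; `RepeatRunner` and `failedRounds` are hypothetical names, and the check is an arbitrary callable rather than a real Maven test run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

final class RepeatRunner {
    // Runs the check `rounds` times and returns the 1-based numbers of the
    // rounds that failed. A check that throws is counted as a failure.
    static List<Integer> failedRounds(int rounds, Callable<Boolean> check) {
        List<Integer> failed = new ArrayList<>();
        for (int round = 1; round <= rounds; round++) {
            boolean ok;
            try {
                ok = check.call();
            } catch (Exception e) {
                ok = false;
            }
            if (!ok) {
                failed.add(round);
            }
        }
        return failed;
    }
}
```

An empty result after a handful of rounds only shows that the failure did not reproduce in this environment; it does not rule out a timing-dependent problem elsewhere.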
-
Re: recent 0.94 failures
lars hofhansl
2013-01-24, 05:26
Hmm... Also got a successful run now.
Maybe it was a temporary env issue. It is just strange that the same test would suddenly fail twice in a row, along with other tests that have not failed in a while.

Looking at the runtime of TestMiniClusterLoadParallel: on Ubuntu1 it took 104s. In the latest run on Ubuntu5 it took 292s.
In the failed runs it took over 500s.

-- Lars
-
Re: recent 0.94 failures
lars hofhansl
2013-01-25, 07:16
Got a lot of failed tests that I have not seen failing before.
It looks like the test VMs collectively got slower. Test times are up from ~45mins to ~70mins.

Lots of the recent failures are because of tests timing out.

-- Lars
-
Re: recent 0.94 failures
Ted Yu
2013-01-25, 15:11
Looking at https://builds.apache.org/job/HBase-0.94/771/console :
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 44:01.553s

FYI
-
Re: recent 0.94 failures
lars hofhansl
2013-01-25, 22:02
More failures. Once TestSplitTransactionOnCluster didn't finish. In the last run TestHBaseFsck did not finish.
-- Lars
-
Re: recent 0.94 failures
Sergey Shelukhin
2013-01-25, 22:37
I see some timeout failures on trunk too. Could these be produced by the same cause?
-
Re: recent 0.94 failures
lars hofhansl
2013-01-26, 06:21
I would also note that these failures are qualitatively different from what I have seen previously:
- The failing tests are seemingly random
- I have run some of these failing tests in a loop for hours, but have not seen any failures locally

Most tests I looked at failed because of some reliance on wall clock time (the test times out, or waits in a loop for something to happen). It almost seems like the build VMs suddenly introduce almost arbitrary wait times.

-- Lars
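The reliance on wall clock time can be illustrated with a small simulation. The following Java sketch is hypothetical and not taken from the HBase test suite: `FakeClock`, `WaitLoop`, and all the numbers are made up for illustration. It shows how the same wait-in-a-loop logic passes on a fast host and times out on a slow one, with no change to the code under test.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Hypothetical controllable clock so the example needs no real sleeping.
final class FakeClock implements LongSupplier {
    private long nowMs;

    @Override
    public long getAsLong() {
        return nowMs;
    }

    void advance(long ms) {
        nowMs += ms;
    }
}

final class WaitLoop {
    // Polls `condition` until it holds or `timeoutMs` has elapsed on `clockMs`.
    // `pause` runs between polls; a real test would sleep there.
    static boolean waitFor(long timeoutMs, LongSupplier clockMs,
                           Runnable pause, BooleanSupplier condition) {
        long deadline = clockMs.getAsLong() + timeoutMs;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (clockMs.getAsLong() >= deadline) {
                return false;
            }
            pause.run();
        }
    }

    // Simulates work that needs `steps` steps, each costing `msPerStep` of
    // wall clock time on the host, waited on with a fixed timeout.
    static boolean simulate(int steps, long msPerStep, long timeoutMs) {
        FakeClock clock = new FakeClock();
        int[] done = {0};
        return waitFor(timeoutMs, clock,
                () -> { clock.advance(msPerStep); done[0]++; },
                () -> done[0] >= steps);
    }
}
```

In this sketch, `WaitLoop.simulate(100, 100, 30000)` succeeds while `WaitLoop.simulate(100, 500, 30000)` times out: the logic is identical and only the assumed per-step cost of the host differs.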