|
Jesse Yates
2011-10-03, 20:55
Jonathan Hsieh
2011-10-04, 00:25
Jesse Yates
2011-10-04, 00:30
Doug Meil
2011-10-04, 00:33
Ramkrishna S Vasudevan
2011-10-04, 06:48
Ramkrishna S Vasudevan
2011-10-04, 10:43
Jesse Yates
2011-10-04, 18:33
N Keywal
2011-10-17, 12:30
Jesse Yates
2011-10-17, 17:40
Doug Meil
2011-10-17, 20:05
N Keywal
2011-10-17, 20:20
Stack
2011-10-18, 11:15
N Keywal
2011-10-18, 12:49
N Keywal
2011-10-18, 19:36
Doug Meil
2011-10-18, 20:51
Stack
2011-10-19, 22:45
|
-
Speeding up tests
Jesse Yates 2011-10-03, 20:55

Hey everyone,

There has been a bunch of work recently on speeding up the testing to make it easier for developers to iterate quickly on new features and fixes. Part of the problem is that the test suite takes anywhere from 1-2 hrs to run and has some apparently non-deterministic hanging of tests.

TL;DR To speed it all up, the attack plan would be:
(1) Move long running tests to be integration tests,
(2) Use a build server for patches so people only run unit tests locally,
(3) Add unit tests when integration tests are breaking a lot but unit tests pass,
(4) Go from forked to single jvm unit tests,
(5) Add in surefire parallelization,
(6) Entertain using HBaseTestingUtilFactory.

I recently chatted with Stack and Doug about ways around this. Here is what we came up with:

(1) Separate long running tests from medium and short (2 mins max) tests and move the former to be IntegrationTests. This was based on Todd's suggestion in HBASE-4438. Whether we name them integration tests or long running functional tests, they would become part of the 'mvn verify' rather than the 'mvn test' suite of tests. Starting point would be to use Doug's spreadsheet from HBASE-4448 and, when it's done, the script from HBASE-4480.

Right now, that means a LOT of tests are going to shift, but it means that when developers run 'mvn test' the amount of time spent running unit tests will be cut down dramatically (hopefully towards the sub 10-15 minute range).

There is an implicit problem here: if the soon to be integration tests capture functionality that is not covered by unit tests, then people may incorrectly think that they are not breaking things. Therefore, we would do (2) and (3):

(2) Add a patch continuous integration server that actually builds and tests patches as they come in. This would run 'mvn verify' and ensure that the patch isn't breaking high level/complex functionality. Passing this build would be a requirement before patches are committed.

(3) If we find that the unit tests aren't covering a certain level of functionality that is constantly breaking on the build server, we add more unit tests of the breaking functionality, so that the unit tests are more complete and provide more assurance to developers when running them.

This would be an ongoing process of comparing the integration tests vs. the unit tests.

(4) Once we have a true unit test suite, we should be able to go from 'forked' jvm mode back to a single jvm for running tests. Unit tests should not do crazy fault injection or full failure scenarios, so they should be able to clean up after themselves. This means we are going to get some speedup from not spinning up a new jvm for each test.

(5) Once we are running in non-forked mode, we can try turning on parallelized test execution in surefire, which does parallel runs in a single jvm.

(6) Once things all run in a single jvm, using the HBaseTestingUtilFactory (HBASE-4448) makes sense, to reuse the mini clusters across tests.

What does everyone think of this approach?

Thanks!

-Jesse Yates
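
For readers who want to picture steps (1), (4) and (5) in a pom.xml, here is a rough sketch. It is not the configuration HBase adopted. It assumes the long running tests get renamed to match an `**/IT*.java` pattern (a hypothetical convention, not something decided in this thread), that the maven-failsafe-plugin is what runs them during 'mvn verify', and that the surefire 2.x `forkMode` parameter of that era is in use. The thread count is an illustrative value.

```xml
<build>
  <plugins>
    <!-- Unit tests: run by 'mvn test' -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <!-- (4) 'once' forks a single JVM for all test classes;
             'never' runs the tests inside the Maven JVM itself;
             'always' forks a new JVM per test class -->
        <forkMode>once</forkMode>
        <!-- (5) run test classes in parallel threads inside that JVM
             (needs the JUnit 4.7+ provider); 4 is an assumed value -->
        <parallel>classes</parallel>
        <threadCount>4</threadCount>
        <!-- (1) keep the long running tests out of 'mvn test';
             the IT* pattern is an assumed naming convention -->
        <excludes>
          <exclude>**/IT*.java</exclude>
        </excludes>
      </configuration>
    </plugin>
    <!-- Long running tests: run by 'mvn verify' -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <goal>integration-test</goal>
            <goal>verify</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <!-- fault injection and cluster restarts get their own JVM -->
        <forkMode>always</forkMode>
        <includes>
          <include>**/IT*.java</include>
        </includes>
      </configuration>
    </plugin>
  </plugins>
</build>
```

With a layout like this, 'mvn test' would only touch the surefire half, while 'mvn verify' (for example on the patch build server from step 2) would run both halves.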
-
Re: Speeding up tests
Jonathan Hsieh 2011-10-04, 00:25

I'm assuming that the long-running/integration tests will still run as new jvm instances?

I've been working on hbase offline recovery tools whose tests (in their current incarnation) require hbase cluster shutdown and restart. I assume these would be integration tests. I've also found, at least in these cases, that it is mini cluster spinup and spindown that is more costly than jvm spin up.

I've also found that there is some file handle leakage in the mini cluster utility classes, and other possible leakage due to statics and whatnot, such as HConnectionManager, which may make using a single jvm infeasible for the long running tests until that gets fixed.

Jon.

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]
-
Re: Speeding up tests
Jesse Yates 2011-10-04, 00:30

Yeah, they would still run in forked mode. There are also a lot of cases where we are testing edge failure scenarios and injecting errors into the cluster that really need to be in their own jvm.

We are trying to cut down on everything that we can to speed up the build. Agreed that a lot of the time is spent on spin up/down of mini-clusters, so if we can just use one across multiple tests, we can see a big speed up.

When we move them into single jvm mode there are definitely going to be some issues, but let's cross that bridge when we come to it :)

-Jesse Yates
-
Re: Speeding up tests
Doug Meil 2011-10-04, 00:33

Thanks Jesse. Great write up!
-
RE: Speeding up tests
Ramkrishna S Vasudevan 2011-10-04, 06:48

Hi Jesse

Thanks for the write up.

I am using the script in HBASE-4480 widely. I have a problem: sometimes some test cases get killed by maven because they took a long time, and those testcases don't have a timeout property in them.

Now if such a testcase doesn't complete (it hangs), then maven kills the run entirely and we are not able to proceed with the other testcases. Could you help me with this?

And do you have a list of flaky testcases? I have prepared a list; maybe you can add to it.

<include>**/TestActiveMasterManager*.java</include>
<include>**/TestMasterFailover*.java</include>
<include>**/TestMasterRestartAfterDisablingTable*.java</include>
<include>**/TestLogsCleaner*.java</include>
<include>**/TestRestartCluster*.java</include>
<include>**/TestMasterAddressManager*.java</include>
<include>**/TestLogRolling*.java</include>
<include>**/TestRegionRebalancing*.java</include>
<include>**/TestZKTable*.java</include>
<include>**/TestZooKeeperNodeTracker*.java</include>
<include>**/TestMergeTool*.java</include>
<include>**/TestMergeTable*.java</include>
<include>**/TestHBaseFsck*.java</include>
<include>**/TestThriftServer*.java</include>
<include>**/TestFullLogReconstruction*.java</include>
<include>**/TestReplicationSink*.java</include>
<include>**/TestReplicationSourceManager*.java</include>
<include>**/TestMasterReplication*.java</include>
<include>**/TestMultiSlaveReplication*.java</include>
<include>**/TestSplitLogWorker*.java</include>
<include>**/TestSplitTransaction*.java</include>
<include>**/TestSplitTransactionOnCluster*.java</include>
<include>**/TestRollingRestart*.java</include>
<include>**/TestSplitLogManager*.java</include>
<include>**/TestAdmin*.java</include>
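
As a hedged illustration of the hanging-test problem described in this message, the surefire fragment below caps how long a forked test JVM may run and forks per test class, so that a kill is limited to one class. The 900 second limit is an assumed value, not one taken from the HBase build, and whether the build carries on with the remaining tests after a kill depends on the plugin version in use.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- kill a forked test JVM that runs longer than this many seconds;
         900 is an illustrative value, 0 means wait forever -->
    <forkedProcessTimeoutInSeconds>900</forkedProcessTimeoutInSeconds>
    <!-- one forked JVM per test class, so the timeout applies per class -->
    <forkMode>always</forkMode>
  </configuration>
</plugin>
```

A per-test alternative is the JUnit 4 `@Test(timeout = ...)` attribute, which is presumably the "timeout property" the message refers to; it fails only the hanging test method rather than killing the whole JVM.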
-
FW: Speeding up tests
Ramkrishna S Vasudevan 2011-10-04, 10:43

I think the problem that I mentioned could be because I was using maven 2.4. In maven 2.9 this problem does not occur: if a testcase gets killed, it continues with the other testcases.

Regards
Ram
-
Re: FW: Speeding up tests
Jesse Yates 2011-10-04, 18:33

Thanks for the list Ram! I haven't had a chance to go through and figure out which tests were hanging regularly (mostly b/c it would take a straight day of consistently running tests), so that is going to be a great start.

Also, I don't think it's unreasonable to expect all devs to run with Maven 3. At least I wasn't seeing the issue you are describing with 3; instead it just goes on to the next test w/o printing out timing stats.

-Jesse Yates
-
Re: FW: Speeding up tests
N Keywal 2011-10-17, 12:30

Hello,

I will be working for a month on the subject, on behalf of StumbleUpon / Stack. The goal is to reduce the build time for developers to a minimum, and to at most half of the time needed now (i.e. from two hours -> 1 hour).

I created a JIRA to ease the follow up: HBASE-4602. I will put all the future sub-JIRAs in this one. I already added the existing ones as "related link".

As a start, I extracted the time taken on the apache server today, plus some hints on what each test is doing: the type of cluster used (dfs, zookeeper, hbase, mapreduce), the logs, potential "Thread.sleep". I attached the resulting excel sheet to HBASE-4602; you may want to have a look. BTW, the second sheet contains the script I used for this.

Strategy will be mainly:
- Cutting down on the number of cluster spinups by coalescing related tests rather than having each spin up its own cluster
- Making cluster start/stop faster
- Rewriting long-running tests so they do not need to be run on a cluster, e.g. by instead mocking expected signals/messages
- Moving long running tests out of the unit test suite to instead run as part of the recently introduced 'integration test' step

Of course, there will be numerous small JIRAs to avoid any big bang effect.

Splitting the tests into unit tests vs. long tests seems quite promising when looking at the excel sheet. Jesse, I understood that you're already working on this? Will you do the split as well?

For myself, at the beginning, I will concentrate on cleaning up the tests and improving the start time of the cluster, so you will see some JIRAs on this. Then I will look at the "long tests" that we would really like to keep as "unit tests".

Regards,

N.
-
Re: FW: Speeding up tests
Jesse Yates 2011-10-17, 17:40

Keywal,

Thanks for helping out with this.

Yeah, I've started working on breaking out some of the tests from unit to integration tests (see HBASE-4559). Basically, I'm just working from top to bottom on the source tree, trying to pull out integration tests and, when possible, replace some of the testing with unit tests backed by mocking.

The unit test version wasn't really possible with 4559, as all the avro stuff would essentially be making sure that it makes the one or two calls to a cluster with essentially no transformation. This kind of thing is not really worth unit testing, as there is no internal behavior being tested; it would instead just mean mocking all the internals. A unit test would tell you: "yeah, it calls these methods in this order," but it is going to break as soon as any behavior changes in the class under test.

The reason I mention the above is that I would caution against writing a unit test that does all this internal mocking; it is a false comfort, in that the test passes because you made it so, not because the functionality is truly "correct", meaning it is actually worse than not having the unit test and just relying on the integration test.

That being said, I'm excited to have you help out with this effort. As a way to make sure we don't overlap work, just make sure you add a ticket for splitting a test/package AND link it to 4438 (the original umbrella ticket) _before_ spending time on doing the extraction.

Sound good?

-Jesse
-
Re: Speeding up tests
Doug Meil 2011-10-17, 20:05

+1 on what Jesse said about mocking. I'm not particularly crazy about mocking for the reasons he cited, and I'd use mocks only as a last resort.
-
Re: FW: Speeding up tests N Keywal 2011-10-17, 20:20
Hi Jesse,

Yes, sounds good! Let me rephrase to be sure I understood your point: you're saying that sometimes a test has value only if there are real components behind it, and mocking would make the test not very useful? I agree with this. I will first try to speed up the tests globally, then attack the remaining issues, with mocking as a possible solution but not a silver bullet to be used indiscriminately.

I was also thinking about splitting the tests 'as they are' between the long-running ones and the others, without changing them. I put a possible list in column 'P' of the excel sheet in HBASE-4602. In my mind, the long-running list is something like: slow test AND (flaky test OR very specialized test OR test interesting only with a third party). A test that fails often (i.e. a good bug catcher) does not fall in this category. This keeps a valuable set of tests that every developer should run before submitting a patch. What do you think?

Maybe it's possible with Jenkins to run the two sets in two JVMs in parallel as well (I haven't checked yet), so that on a build server all the tests could be run efficiently too.

That said, in the very short term, I am concentrating on making the MiniCluster start and stop faster... See HBASE-4603 for example. And I will add a link to 4438 before doing any split!

Cheers,
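The proposed rule for the long-running list is a plain boolean expression, so it can be written down directly. What follows is a minimal sketch, not anything from the HBase code base: the `TestProfile` class, its field names, and the sample test names are all hypothetical, and how a test gets flagged as "slow" or "flaky" (for example from the timings in the spreadsheet) is left open.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical description of one test; the fields are assumptions made for this sketch.
final class TestProfile {
    final String name;
    final boolean slow;           // above whatever duration threshold is agreed on
    final boolean flaky;          // fails intermittently without a code change
    final boolean specialized;    // covers a very narrow case
    final boolean thirdPartyOnly; // interesting only together with a third-party component

    TestProfile(String name, boolean slow, boolean flaky,
                boolean specialized, boolean thirdPartyOnly) {
        this.name = name;
        this.slow = slow;
        this.flaky = flaky;
        this.specialized = specialized;
        this.thirdPartyOnly = thirdPartyOnly;
    }

    // The rule from the message: slow AND (flaky OR very specialized OR third-party only).
    boolean isLongRunning() {
        return slow && (flaky || specialized || thirdPartyOnly);
    }
}

final class SplitRuleSketch {
    public static void main(String[] args) {
        // Made-up test names, for illustration only.
        List<TestProfile> tests = Arrays.asList(
            new TestProfile("TestSlowThirdParty", true, false, false, true),
            new TestProfile("TestSlowBugCatcher", true, false, false, false),
            new TestProfile("TestFastButFlaky", false, true, false, false));
        for (TestProfile t : tests) {
            System.out.println(t.name + " -> " + (t.isLongRunning() ? "long-running" : "core"));
        }
    }
}
```

Note that a slow test with none of the other three flags stays in the core set, which matches the remark that a good bug catcher should not be moved out.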
-
Re: FW: Speeding up tests Stack 2011-10-18, 11:15
On Mon, Oct 17, 2011 at 8:20 PM, N Keywal <[EMAIL PROTECTED]> wrote:
> I was also thinking about splitting the test 'as they are' between the long
> running ones and the others, without changing them.

How are you thinking of splitting them, Nicolas? You mean put them into a separate phase or name them differently?

Good stuff,
St.Ack
-
Re: FW: Speeding up tests N Keywal 2011-10-18, 12:49
Yes, putting them in a different phase. Actually, it could be done by renaming them, as surefire can launch a set of tests selected by a pattern (not tested :-).

Tests like replication could fall into this approach, as they take a lot of time, depend on the underlying cluster, and should not be impacted by most of the changes in HBase.

Nicolas
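Surefire's `includes` and `excludes` configuration parameters do accept file-name patterns, so a rename-based split is plausible. The selection logic itself can be sketched in a few lines of Java. This is only an illustration of the idea: the `LongTest` suffix is an assumed naming convention invented for the example, the class names are made up, and a plain regular expression stands in for surefire's own pattern syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class NamePatternSplit {
    // Assumed convention for this sketch only: long-running tests are renamed to end in "LongTest".
    static final Pattern LONG_RUNNING = Pattern.compile(".*LongTest");

    // Returns the class names that belong to the requested phase.
    static List<String> select(List<String> classNames, boolean longRunning) {
        List<String> selected = new ArrayList<String>();
        for (String name : classNames) {
            if (LONG_RUNNING.matcher(name).matches() == longRunning) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up class names, for illustration only.
        List<String> all = Arrays.asList(
            "TestScannerBasics", "TestReplicationLongTest", "TestClientBasics");
        System.out.println("quick phase: " + select(all, false));
        System.out.println("long phase:  " + select(all, true));
    }
}
```

Because the two selections are complements of the same pattern, every test lands in exactly one phase, which is the property a rename-based split needs.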
-
Re: FW: Speeding up tests N Keywal 2011-10-18, 19:36
Hi,

I've identified a few sleep/wait related issues; in the end, fixing them should cut the start/stop time in half. I am waiting a little to see whether the fixes can be generalized, and then I will publish them. There are also possible improvements in the tests themselves, mostly small bug fixes, which should buy a few minutes as well.

If I do things in a pure & clean way, that's gonna create a big big bunch of very very small JIRAs. Is that ok, or do you prefer medium sized ones?

Then I will have a look at the profiler results. Who knows :-)

The next task could be defining the strategy for the split. I tend to believe that we can categorize the tests between:
- core unit tests, with a contract: should be run before submitting a patch, should not take too long, should not be flaky, should cover most of HBase
- long running tests: long or flaky, not on the core of HBase, or testing specific cases (like timeout)

However, a failure of a test should break the central build, whatever the category.

I have not made up my mind at all; this just seems an easy way to make progress.

Cheers,
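One way to make the two categories explicit is to record the category on the test class itself and let the build pick classes by label. The sketch below assumes a home-made `@TestKind` annotation read through reflection; the annotation, the enum, and the sample classes are all hypothetical, and nothing in the thread says this is the mechanism that will be used.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

final class CategorySketch {
    enum Kind { CORE, LONG_RUNNING }

    // Hypothetical marker annotation; kept at runtime so it can be read by reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface TestKind {
        Kind value();
    }

    // Made-up test classes, for illustration only.
    @TestKind(Kind.CORE)
    static class TestPutGet {}

    @TestKind(Kind.LONG_RUNNING)
    static class TestTimeoutHandling {}

    static class TestUnlabelled {}

    // Assumption of this sketch: an unlabelled test counts as CORE,
    // so nothing silently drops out of the pre-patch run.
    static Kind kindOf(Class<?> testClass) {
        TestKind label = testClass.getAnnotation(TestKind.class);
        return label == null ? Kind.CORE : label.value();
    }

    static List<Class<?>> select(Kind wanted, Class<?>... candidates) {
        List<Class<?>> selected = new ArrayList<Class<?>>();
        for (Class<?> c : candidates) {
            if (kindOf(c) == wanted) {
                selected.add(c);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Class<?>[] all = { TestPutGet.class, TestTimeoutHandling.class, TestUnlabelled.class };
        for (Kind kind : Kind.values()) {
            System.out.println(kind + ": " + select(kind, all).size() + " test class(es)");
        }
    }
}
```

The central build would run both selections and fail on any failure, so the label only decides what a developer runs locally, not what is allowed to break.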
-
Re: Speeding up tests Doug Meil 2011-10-18, 20:51
Excellent!

On 10/18/11 3:36 PM, "N Keywal" <[EMAIL PROTECTED]> wrote:
>I've identified a few sleep/wait related stuff, at the end it should cut
>the start/stop time in half
-
Re: FW: Speeding up tests Stack 2011-10-19, 22:45
On Tue, Oct 18, 2011 at 12:36 PM, N Keywal <[EMAIL PROTECTED]> wrote:
> If I do things in a pure & clean way, that's gonna create a big big bunch of
> very very small JIRAs. Is that ok, or do you prefer medium sized ones?

Medium-sized sounds better. Group as best you can. It doesn't have to be pure when it comes to test refactoring, I'd say.

> Then I will have a look at the profiler results. Who knows :-)

Little time has been spent on profiling start/stop, I'd say. It's mostly message passing with wait loops, and we probably do it the dumbest way possible (smile).

> The next task could be defining the strategy for the split. I tend to
> believe that we can categorize the tests between:
> - core unit tests: with a contract: should be run before submitting a patch,
>   should not take too long, should not be flaky, should cover most of HBase
> - long running tests: long or flaky, not on the core of HBase, or testing
>   specific cases (like timeout)

Sounds good. Flaky tests need to be fixed. It's good fingering them as flaky, but I would like to shut down a test soon after it is identified (I don't think this is part of the speed-up tests task; it's a separate endeavor).

Good stuff.
St.Ack
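On the wait loops: the thread does not show the actual start/stop code or the planned fixes, so the following is only a generic sketch of one common way to shorten such waits, namely blocking on a signal with a timeout instead of sleeping for a fixed interval. `FakeService` is a stand-in written for this example, not an HBase class.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class StartupWaitSketch {
    // Stand-in for a component that signals when it has finished starting.
    static final class FakeService {
        private final CountDownLatch started = new CountDownLatch(1);

        void startAsync(final long startupMillis) {
            Thread worker = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        Thread.sleep(startupMillis); // simulated startup work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    started.countDown(); // signal: the service is up
                }
            });
            worker.setDaemon(true);
            worker.start();
        }

        // Returns as soon as the signal arrives, or false once the timeout expires.
        boolean awaitStarted(long timeoutMillis) {
            try {
                return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        FakeService service = new FakeService();
        long begin = System.nanoTime();
        service.startAsync(50);
        // A fixed Thread.sleep(5000) here would always cost five seconds;
        // waiting on the latch costs only as long as startup really takes.
        boolean up = service.awaitStarted(5000);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
        System.out.println("started=" + up + " after about " + elapsedMs + " ms");
    }
}
```

The timeout remains as a safety net, so a component that never comes up still fails the test instead of hanging it.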