HBase >> mail # dev >> Speeding up tests


Re: FW: Speeding up tests
Keywal,

Thanks for helping out with this.

Yeah, I've started working on breaking out some of the tests from unit to
integration tests (see HBASE-4559). Basically, I'm just working from top to
bottom on the source tree, trying to pull out integration tests and, when
possible, replace some of the testing with unit tests backed by mocking.

The unit test version wasn't really possible with 4559, as all the avro stuff
would essentially just be making sure that it makes the one or two calls to a
cluster with essentially no transformation. This kind of thing is not really
worth unit testing: there is no internal behavior being tested, so the test
would just mean mocking all the internals. A unit test would tell you,
"yeah, it calls these methods in this order," but it is going to break as
soon as any behavior changes in the class under test.

The reason I mention the above is I would caution against writing a unit
test that does all this internal mocking; it is a false comfort in that the
test passes because you made it so, not really because the functionality is
truly "correct", meaning it is actually worse than not having the unit test
and just relying on the integration test.
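
As a rough illustration of that kind of test (a hypothetical sketch only:
`ClusterClient`, `ThinGateway`, and `RecordingClient` are made-up names, not
HBase or Avro APIs, and this is not the code from HBASE-4559), a hand-rolled
mock around a thin pass-through class can do little more than assert the
call sequence:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a cluster-facing client; not a real HBase interface.
interface ClusterClient {
    void open(String table);
    String fetch(String row);
    void close();
}

// Hypothetical thin gateway: forwards calls with essentially no transformation.
class ThinGateway {
    private final ClusterClient client;

    ThinGateway(ClusterClient client) {
        this.client = client;
    }

    String get(String table, String row) {
        client.open(table);
        try {
            return client.fetch(row);
        } finally {
            client.close();
        }
    }
}

// Hand-rolled mock: records each call and returns a canned value.
class RecordingClient implements ClusterClient {
    final List<String> calls = new ArrayList<>();

    public void open(String table) { calls.add("open:" + table); }

    public String fetch(String row) {
        calls.add("fetch:" + row);
        return "value-of-" + row;
    }

    public void close() { calls.add("close"); }
}

class BrittleGatewayTest {
    public static void main(String[] args) {
        RecordingClient mock = new RecordingClient();
        String result = new ThinGateway(mock).get("t1", "row1");

        // The only things left to check are the canned value and the call order.
        if (!"value-of-row1".equals(result)) {
            throw new AssertionError("unexpected result: " + result);
        }
        List<String> expected = Arrays.asList("open:t1", "fetch:row1", "close");
        if (!mock.calls.equals(expected)) {
            throw new AssertionError("unexpected calls: " + mock.calls);
        }
    }
}
```

The checks pass only because the mock was scripted to make them pass.
Reordering or batching the calls inside `get` would break the test even if
the behavior against a real cluster stayed correct.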

That being said, I'm excited to have you help out with this effort. As a way
to make sure we don't overlap work, just make sure you add a ticket for
splitting out a test/package AND link it to 4438 (the original umbrella ticket)
_before_ spending time on doing the extraction.

Sound good?

-Jesse

On Mon, Oct 17, 2011 at 5:30 AM, N Keywal <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I will be working for a month on the subject, on behalf of StumbleUpon /
> Stack. The goal is to reduce the build time for developers to a minimum,
> and to at least halve the time needed now (i.e., from two hours to one hour).
>
> I created a JIRA to ease the follow-up: HBASE-4602. I will put all the
> future sub-JIRAs in this one. I already added the existing ones as "related
> links".
>
> As a start, I extracted the time taken on the Apache server today, plus some
> hints on what each test is doing: the type of cluster used (dfs, zookeeper,
> hbase, mapreduce), the logs, potential "Thread.sleep" calls. I attached the
> resulting Excel sheet to HBASE-4602; you may want to have a look. BTW, the
> second sheet contains the script I used for this.
>
> The strategy will mainly be:
> - Cut down on the number of cluster spinups by coalescing related tests
> rather than having each spin up its own cluster
> - Make cluster start/stop faster
> - Rewrite long-running tests so they do not need to be run on a cluster,
> e.g. by instead mocking expected signals/messages
> - Move long-running tests out of the unit test suite to instead run as part
> of the recently introduced 'integration test' step
>
> Of course, there will be numerous small JIRAs to avoid any big bang effect.
>
> Splitting the tests into unit tests vs. long tests seems quite promising when
> looking at the Excel sheet. Jesse, I understand that you're already working
> on this? Will you do the split as well?
>
> For myself, at the beginning, I will concentrate on cleaning up the tests
> and improving the start time of the cluster, so you will see some JIRAs on
> this. Then I will look at the "long tests" that we would really like to
> keep as "unit tests".
>
>
> Regards,
>
> N.
>