HBase >> mail # dev >> Performance Testing


Re: Performance Testing
just brainstorming =)

Some of those are motivated by the performance tests I wrote for data block
encoding:
https://github.com/hotpads/hbase-prefix-trie/tree/master/test/org/apache/hadoop/hbase/cell/pt/test/performance/seek
In that directory:

* SeekBenchmarkMain gathers all of the test parameters.  Perhaps we could
have a test configuration input file format where standard test configs are
put in source control
* For each combination of input parameters it runs a SingleSeekBenchmark
* As it runs, the SingleSeekBenchmark adds results to a SeekBenchmarkResult
* Each SeekBenchmarkResult is logged after each SingleSeekBenchmark, and
all of them are logged again at the end for pasting into a spreadsheet

They're probably too customized to my use case, but maybe we can draw ideas
from the structure/workflow and make it applicable to more use cases.
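The workflow described above (gather parameters, run one benchmark per parameter combination, log each result as it completes, then dump everything again for spreadsheet import) can be sketched roughly as follows. All class, record, and field names here are illustrative stand-ins, not the actual hbase-prefix-trie classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a SeekBenchmarkMain-style parameter sweep:
// enumerate combinations, run one benchmark per combination, and collect
// per-run results for a final spreadsheet-friendly dump.
public class BenchmarkSweep {

    // One combination of input parameters (e.g. block size and compression).
    record Params(int blockSize, String compression) {}

    // One result row, suitable for pasting into a spreadsheet as CSV.
    record Result(Params params, long nanosPerSeek) {
        String toCsv() {
            return params.blockSize() + "," + params.compression() + "," + nanosPerSeek;
        }
    }

    // Stand-in for SingleSeekBenchmark: here we just compute a fake timing
    // from the parameters instead of measuring real seeks.
    static Result runSingle(Params p) {
        long fakeNanos = 1000L / p.blockSize()
            + (p.compression().equals("NONE") ? 0 : 50);
        return new Result(p, fakeNanos);
    }

    public static void main(String[] args) {
        List<Result> results = new ArrayList<>();
        for (int blockSize : new int[] {4, 8}) {
            for (String compression : new String[] {"NONE", "GZ"}) {
                Result r = runSingle(new Params(blockSize, compression));
                System.out.println(r.toCsv()); // log each result as it completes
                results.add(r);
            }
        }
        // Log all results again at the end for pasting into a spreadsheet.
        System.out.println("blockSize,compression,nanosPerSeek");
        results.forEach(r -> System.out.println(r.toCsv()));
    }
}
```

The nested loops are where a test-configuration input file, if one existed in source control, would feed the sweep instead of hard-coded arrays.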
On Thu, Jun 21, 2012 at 2:47 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:

> Concur. That's ambitious!
>
> On Thu, Jun 21, 2012 at 1:57 PM, Ryan Ausanka-Crues
> <[EMAIL PROTECTED]> wrote:
> > Thanks Matt. These are great!
> > ---
> > Ryan Ausanka-Crues
> > CEO
> > Palomino Labs, Inc.
> > [EMAIL PROTECTED]
> > (m) 805.242.2486
> >
> > On Jun 21, 2012, at 12:36 PM, Matt Corgan wrote:
> >
> >> These are geared more towards development than regression testing, but
> >> here are a few ideas that I would find useful:
> >>
> >> * Ability to run the performance tests (or at least a subset of them)
> >> on a development machine would help people avoid committing regressions
> >> and would speed development in general
> >> * Ability to test a single region without the heavier-weight servers
> >> and clusters
> >> * Letting the test run with multiple combinations of input parameters
> >> (block size, compression, blooms, encoding, flush size, etc, etc).
> >> Possibly many combinations that could take a while to run
> >> * Output results to a CSV file that's importable to a spreadsheet for
> >> sorting/filtering/charting.
> >> * Email the CSV file to the user notifying them the tests have finished.
> >> * Getting fancier: ability to specify a list of branches or tags from
> >> git or subversion as inputs, which would allow the developer to tag many
> >> different performance changes and later figure out which combination is
> >> the best (all before submitting a patch)
> >>
> >>
> >> On Thu, Jun 21, 2012 at 12:13 PM, Elliott Clark <[EMAIL PROTECTED]>
> >> wrote:
> >>
> >>> I actually think that more measurements are needed than just per
> >>> release. The best I could hope for would be a four-node+ cluster (one
> >>> master and three slaves) that, for every check-in on trunk, runs
> >>> multiple different perf tests.
> >>>
> >>>
> >>>  - All Reads (Scans)
> >>>  - Large Writes (Should test compactions/flushes)
> >>>  - Read Dominated with 10% writes
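As a rough illustration of the third workload, here is a hypothetical sketch of how a harness might mix roughly 10% writes into a read-dominated run. The class name and the 10% constant are assumptions for illustration, not actual HBase test code:

```java
import java.util.Random;

// Hypothetical read-dominated workload mix: each operation is independently
// chosen to be a write with probability WRITE_FRACTION (about 10%).
public class MixedWorkload {
    static final double WRITE_FRACTION = 0.10;

    // Count how many of totalOps operations come out as writes, using a
    // seeded RNG so a given run is reproducible.
    static int countWrites(int totalOps, long seed) {
        Random rng = new Random(seed);
        int writes = 0;
        for (int i = 0; i < totalOps; i++) {
            if (rng.nextDouble() < WRITE_FRACTION) {
                writes++; // a real harness would issue a Put here
            }
            // ...and a Get/Scan here on the read branch.
        }
        return writes;
    }

    public static void main(String[] args) {
        int writes = countWrites(100_000, 42L);
        System.out.println("writes: " + writes + " of 100000");
    }
}
```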
> >>>
> >>> Then every check-in can be evaluated, and large regressions can be
> >>> treated as bugs. With that we can also see the difference between the
> >>> different versions. http://arewefastyet.com/ is kind of the model that
> >>> I would love to see, and I'm more than willing to help wherever needed.
> >>>
> >>> However, in reality, every night will probably be more feasible, and
> >>> four nodes is probably not going to happen either.
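The "large regressions treated as bugs" idea above implies some automated gate comparing each check-in's numbers against a baseline. A minimal sketch, assuming a stored baseline timing and an arbitrary 10% tolerance (both hypothetical, not an existing HBase mechanism):

```java
// Hypothetical regression gate in the arewefastyet.com spirit: compare a
// check-in's benchmark timing against a stored baseline and treat a large
// slowdown as a bug. Threshold and numbers are illustrative.
public class RegressionGate {

    // Flag a regression when the new timing is more than `tolerance`
    // (e.g. 0.10 == 10%) slower than the baseline.
    static boolean isRegression(double baselineMillis, double currentMillis,
                                double tolerance) {
        return currentMillis > baselineMillis * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        // Example: scan workload baseline 120 ms; this check-in measured 140 ms.
        double baseline = 120.0, current = 140.0;
        if (isRegression(baseline, current, 0.10)) {
            System.out.println("REGRESSION: scan workload slowed by "
                + String.format("%.1f%%", (current / baseline - 1.0) * 100));
        }
    }
}
```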
> >>>
> >>> On Thu, Jun 21, 2012 at 11:38 AM, Andrew Purtell <[EMAIL PROTECTED]>
> >>> wrote:
> >>>
> >>>> On Wed, Jun 20, 2012 at 10:37 PM, Ryan Ausanka-Crues
> >>>> <[EMAIL PROTECTED]> wrote:
> >>>>> I think it makes sense to start by defining the goals for the
> >>>>> performance testing project and then deciding what we'd like to
> >>>>> accomplish. As such, I'll start by soliciting ideas from everyone on
> >>>>> what they would like to see from the project. We can then collate
> >>>>> those thoughts and prioritize the different features. Does that
> >>>>> sound like a reasonable approach?
> >>>>
> >>>> In terms of defining a goal, the fundamental need I see for us as a