just brainstorming =)
Some of those are motivated by the performance tests I wrote for data block encoding.
In that directory:
* SeekBenchmarkMain gathers all of the test parameters. Perhaps we could
have a test configuration input file format where standard test configs are
put in source control
* For each combination of input parameters it runs a SingleSeekBenchmark
* As it runs, the SingleSeekBenchmark adds results to a SeekBenchmarkResult
* Each SeekBenchmarkResult is logged after each SingleSeekBenchmark, and
all of them are logged again at the end for pasting into a spreadsheet
They're probably too customized to my use case, but maybe we can draw ideas
from the structure/workflow and make it applicable to more use cases.
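
To make that structure concrete, here's a minimal sketch of the gather/run/log loop (class and method names below are illustrative only, not the actual test classes, and the parameters are placeholders):

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only -- not the real SeekBenchmark* classes.
public class BenchmarkSweepSketch {

  // One combination of input parameters for a single run.
  static class Params {
    final int blockSizeBytes;
    final boolean compression;
    Params(int blockSizeBytes, boolean compression) {
      this.blockSizeBytes = blockSizeBytes;
      this.compression = compression;
    }
  }

  // Result of one run; printed as it finishes and again at the end.
  static class Result {
    final Params params;
    final long seeksPerSecond;
    Result(Params params, long seeksPerSecond) {
      this.params = params;
      this.seeksPerSecond = seeksPerSecond;
    }
    public String toString() {
      // tab-separated so the final dump pastes cleanly into a spreadsheet
      return params.blockSizeBytes + "\t" + params.compression + "\t" + seeksPerSecond;
    }
  }

  public static void main(String[] args) {
    // "Main" role: gather all parameter combinations (these could come from a
    // checked-in test configuration file instead of being hard-coded).
    List<Params> combinations = new ArrayList<Params>();
    for (int blockSize : new int[] { 4 * 1024, 64 * 1024 }) {
      for (boolean compression : new boolean[] { false, true }) {
        combinations.add(new Params(blockSize, compression));
      }
    }

    // "SingleBenchmark" role: run each combination, log each result as it finishes.
    List<Result> results = new ArrayList<Result>();
    for (Params p : combinations) {
      Result r = runSingle(p);
      System.out.println(r);
      results.add(r);
    }

    // Log everything again at the end for pasting into a spreadsheet.
    System.out.println("blockSize\tcompression\tseeksPerSec");
    for (Result r : results) {
      System.out.println(r);
    }
  }

  // Placeholder for the actual seek workload against one combination.
  private static Result runSingle(Params p) {
    return new Result(p, 0L);
  }
}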
On Thu, Jun 21, 2012 at 2:47 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> Concur. That's ambitious!
> On Thu, Jun 21, 2012 at 1:57 PM, Ryan Ausanka-Crues
> <[EMAIL PROTECTED]> wrote:
> > Thanks Matt. These are great!
> > ---
> > Ryan Ausanka-Crues
> > CEO
> > Palomino Labs, Inc.
> > [EMAIL PROTECTED]
> > (m) 805.242.2486
> > On Jun 21, 2012, at 12:36 PM, Matt Corgan wrote:
> >> These are geared more towards development than regression testing, but
> >> here are a few ideas that I would find useful:
> >> * Ability to run the performance tests (or at least a subset of them) on a
> >> development machine would help people avoid committing regressions and
> >> would speed development in general
> >> * Ability to test a single region without heavier weight servers and
> >> clusters
> >> * Letting the test run with multiple combinations of input parameters
> >> (block size, compression, blooms, encoding, flush size, etc, etc).
> >> Possibly many combinations that could take a while to run
> >> * Output results to a CSV file that's importable to a spreadsheet for
> >> sorting/filtering/charting.
> >> * Email the CSV file to the user notifying them the tests have finished.
> >> * Getting fancier: ability to specify a list of branches or tags from git
> >> or subversion as inputs, which would allow the developer to tag many
> >> different performance changes and later figure out which combination is
> >> best (all before submitting a patch)
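
For the parameter-combination and CSV items in the list above, a minimal sketch of what the output step could look like (hypothetical names and placeholder values, plain java.io, nothing that exists in the codebase):

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Hypothetical sketch: write one CSV row per parameter combination.
public class CsvResultsSketch {

  public static void main(String[] args) throws IOException {
    // Example sweep: block size x compression codec (values are placeholders).
    int[] blockSizes = { 4 * 1024, 64 * 1024 };
    String[] codecs = { "NONE", "GZ" };

    PrintWriter out = new PrintWriter(new FileWriter("perf-results.csv"));
    try {
      out.println("blockSize,codec,throughput"); // header row for the spreadsheet
      for (int blockSize : blockSizes) {
        for (String codec : codecs) {
          double throughput = runOneCombination(blockSize, codec);
          out.println(blockSize + "," + codec + "," + throughput);
        }
      }
    } finally {
      out.close();
    }
    // The resulting file could then be attached to the notification email.
  }

  // Placeholder for whatever actually runs the test for one combination.
  private static double runOneCombination(int blockSize, String codec) {
    return 0.0;
  }
}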
> >>> On Thu, Jun 21, 2012 at 12:13 PM, Elliott Clark <[EMAIL PROTECTED]> wrote:
> >>> I actually think that more measurements are needed than just per
> >>> The best I could hope for would be a four-node+ cluster (one master and
> >>> three slaves) that, for every check-in on trunk, runs multiple different
> >>> tests:
> >>> - All Reads (Scans)
> >>> - Large Writes (Should test compactions/flushes)
> >>> - Read Dominated with 10% writes
> >>> Then every checkin can be evaluated and large regressions can be treated as
> >>> bugs. And with that we can see the difference between the different
> >>> versions as well. http://arewefastyet.com/ is kind of the model that I
> >>> would love to see. And I'm more than willing to help wherever I can.
> >>> However, in reality, every night will probably be more feasible. And four
> >>> nodes is probably not going to happen either.
> >>>> On Thu, Jun 21, 2012 at 11:38 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> >>>> On Wed, Jun 20, 2012 at 10:37 PM, Ryan Ausanka-Crues
> >>>> <[EMAIL PROTECTED]> wrote:
> >>>>> I think it makes sense to start by defining the goals for the performance
> >>>>> testing project and then deciding what we'd like to accomplish. As such, I
> >>>>> start by soliciting ideas from everyone on what they would like to see from
> >>>>> the project. We can then collate those thoughts and prioritize the different
> >>>>> features. Does that sound like a reasonable approach?
> >>>> In terms of defining a goal, the fundamental need I see for us as a