HBase >> mail # dev >> YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)


Lars George 2011-01-22, 00:00
Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)
hello Mingjie,
this comes at a very apt time for me. I will be evaluating HBase on EC2
using YCSB, and will run MapReduce jobs there. For instance, I will
evaluate some simple aggregation ones (1512) with MapReduce jobs, coprocessors,
and pure HBase APIs (like Scan + client-side processing).

I have things running locally and will move to EC2 pretty soon (by today).
Right now I have zero experience with setting up HBase on EC2. I may be bugging you
guys in case I get stuck. :)

Thanks,
Himanshu

On Fri, Jan 21, 2011 at 1:40 PM, Mingjie Lai <[EMAIL PROTECTED]> wrote:

> Guys.
> There is a discussion regarding testing HBase with YCSB on Whirr or EC2.
> Sending it to @dev so more people can be involved.
>
> Lars.
> I have an automated YCSB test for HBase running on EC2. It was derived from
> Andy and Eugene's HBase EC2 script. What I added includes:
> - YCSB test support
> - building and uploading a new HBase jar, triggered by SCM (git) changes
> - emailing YCSB test results to configured recipients
> - running automatically as a daily cron job
>
> You can take a look at: https://github.com/mlai/hbase-ec2/tree/ycsb for
> more detail.
>
> We do want to move the script to support Whirr, but right now we lack the
> resources to do the job. Also, it seems a Whirr HBase bug has been reported,
> although I haven't checked the details. So there is no further
> progress toward Whirr support right now.
>
> >> Reporting back the results will be a bit more challenging as usually
> >> you spin down the cluster at end.
> I was also bothered a lot by what the best way to present the results
> of an automated test could be. I picked the simplest way, sending the results by
> email, so that I can avoid the problem of saving the data somewhere.
>
> But it could be extended to support Hudson. Right now it downloads the
> result files locally after the YCSB tests finish and parses them locally,
> which is where I grab the result details for the email contents. I think Hudson can use
> the same files to present results.
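
The download-and-parse step described above might look roughly like the sketch below. It is not taken from the hbase-ec2 scripts. It assumes the result files use the "[SECTION], metric, value" summary lines that the YCSB client prints, and the names parse_ycsb and summarize, as well as the sample numbers, are made up for illustration.

```python
import re
from collections import defaultdict

# Illustrative sample in the "[SECTION], metric, value" layout.
# The numbers are invented; they are not real test results.
SAMPLE = """\
YCSB Client 0.1
[OVERALL], RunTime(ms), 10110
[OVERALL], Throughput(ops/sec), 98.9
[UPDATE], Operations, 491
[UPDATE], AverageLatency(ms), 2.5
[READ], Operations, 509
[READ], AverageLatency(ms), 1.2
[READ], 0, 400
"""

# One summary line: "[SECTION], metric name, numeric value"
LINE = re.compile(r"^\[(?P<section>[A-Z-]+)\],\s*(?P<metric>[^,]+),\s*(?P<value>[-+0-9.eE]+)\s*$")


def parse_ycsb(text):
    """Collect summary metrics per section, skipping anything that does not match."""
    results = defaultdict(dict)
    for line in text.splitlines():
        m = LINE.match(line)
        if m is None:
            continue  # banner lines, blank lines, log noise
        metric = m.group("metric").strip()
        if metric.isdigit():
            continue  # assumed to be a latency histogram bucket, not a summary metric
        results[m.group("section")][metric] = float(m.group("value"))
    return dict(results)


def summarize(results):
    """Render the parsed metrics as plain text, e.g. for an email body."""
    lines = []
    for section in sorted(results):
        for metric, value in results[section].items():
            lines.append(f"{section:10s} {metric:25s} {value:g}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(summarize(parse_ycsb(SAMPLE)))
```

The same parsed dictionary could feed either an email body or a report file that a CI server picks up, which is why parsing and rendering are kept as separate functions here.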
>
> >> And we do
> >> not want to keep the cluster running unnecessarily for a built-in web
> >> interface to browse the results etc.
> Totally agree, we want to terminate the cluster as soon as the test
> finishes.
>
> Here is an example of a test result:
> http://pastebin.com/f08bRCkY
>
> What do you think, Lars?
>
> Thanks,
> Mingjie
>
>
> -------- Original Message --------
> Subject:        Re: Report to Apache board: first cut
> Date:   Fri, 21 Jan 2011 09:46:46 -0800
> From:   Stack <[EMAIL PROTECTED]>
>
> +1 to Todd's suggestion (and change subject -- smile)
> St.Ack
>
> On Fri, Jan 21, 2011 at 8:19 AM, Todd Lipcon<[EMAIL PROTECTED]>  wrote:
>
>>  Should we move this discussion to the dev list at large?
>>
>>  Our QA team is also starting to look at at least smoke testing HBase on a
>>  cluster. We should coordinate efforts!
>>
>>  On Fri, Jan 21, 2011 at 12:56 AM, Lars George<[EMAIL PROTECTED]>
>>  wrote:
>>
>>>  Hi Andy,
>>>
>>>  I assumed as much from our previous conversations. I sent Eugene the
>>>  details on Whirr and using HBase with it. Unfortunately, JClouds
>>>  cannot yet ship the scripts from the local directory, but
>>>  that is coming soon. In the meantime we need to use a "public" S3
>>>  based repo that has a copy. He had that set up last time we got HBase
>>>  running together using Whirr. I think he is pretty much set. We simply
>>>  need to add a specific "test" role that allows us to start the cluster,
>>>  and when "test" is part of the template we can not only start the
>>>  cluster but also invoke whatever test we need. In effect we could have
>>>  "test-ycsb-basic", "test-ycsb-workload-5050", "test-mvn-test" (for the
>>>  built-in tests) and so on to start this. That has the advantage of
>>>  being able to use various templates to test different cluster setups
>>>  against equally different test scenarios.
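
One way to picture the "test" role idea is the sketch below. It is purely illustrative and does not use any Whirr API. The function names, the "test-" prefix check, and the cluster role names in the usage example are assumptions; only the three test role names come from the message above.

```python
TEST_PREFIX = "test-"

# Test roles named in the thread; what each one actually runs is left open.
KNOWN_TESTS = {"test-ycsb-basic", "test-ycsb-workload-5050", "test-mvn-test"}


def split_roles(roles):
    """Separate ordinary cluster roles from "test-*" roles in one template."""
    cluster = [r for r in roles if not r.startswith(TEST_PREFIX)]
    tests = [r for r in roles if r.startswith(TEST_PREFIX)]
    return cluster, tests


def plan(roles, known_tests=KNOWN_TESTS):
    """Turn a role list into ordered steps: start cluster, run tests, tear down."""
    cluster, tests = split_roles(roles)
    unknown = [t for t in tests if t not in known_tests]
    if unknown:
        raise ValueError(f"unknown test roles: {unknown}")
    steps = [("start", cluster)]
    steps += [("run", t) for t in tests]
    steps.append(("destroy", cluster))  # spin the cluster down once tests are done
    return steps


if __name__ == "__main__":
    # The cluster role names here are placeholders, not real template values.
    for step in plan(["zookeeper", "master", "regionserver", "test-ycsb-basic"]):
        print(step)
```

Keeping the test name in the template means the same cluster layout can be paired with different tests, and the same test with different layouts, just by editing the role list.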
>>>
>>>  Reporting back the results will be a bit more challenging as usually
>>>  you spin down the cluster at end. We could grab whatever the test