Re: [maybe off-topic?] article: Solving Big Data Challenges for Enterprise Application Performance Management
This paper could very well have ended up benchmarking the relative
performance of the YCSB drivers. Some takeaways for me here are:

    - Cluster setup is still too difficult.

    - There are opportunities for autotuning that would make it easier for
users to get it right the first time, and for academics and casual
benchmarkers alike to get a good result without becoming experts in HBase
configuration.

    - The client library has been evolving toward fully async dispatch. We
should focus on this, and perhaps even consider reimplementing the sync
client on a refactored async core. We should also look at making the
Thrift-based stuff that FB put in front and center, because then native
clients are possible.

    - Given the above client work, the YCSB HBase driver should be
rewritten.
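
To illustrate the "sync client on a refactored async core" idea, here is a minimal, language-neutral sketch in Python. It is not HBase code: `AsyncKVClient`, `SyncKVClient`, and the in-memory store are hypothetical stand-ins, and the only point is that the blocking API can be a thin facade that delegates every call to the async implementation.

```python
import asyncio


class AsyncKVClient:
    """Hypothetical async core: every operation is a coroutine."""

    def __init__(self):
        # In-memory dict standing in for a remote table (assumption).
        self._store = {}

    async def put(self, key, value):
        # sleep(0) stands in for a non-blocking network round trip.
        await asyncio.sleep(0)
        self._store[key] = value

    async def get(self, key):
        await asyncio.sleep(0)
        return self._store.get(key)


class SyncKVClient:
    """Blocking facade that delegates every call to the async core."""

    def __init__(self, core=None):
        self._core = core or AsyncKVClient()
        # A private event loop drives the async core to completion.
        self._loop = asyncio.new_event_loop()

    def put(self, key, value):
        self._loop.run_until_complete(self._core.put(key, value))

    def get(self, key):
        return self._loop.run_until_complete(self._core.get(key))

    def close(self):
        self._loop.close()
```

With this shape there is only one request path to tune and benchmark, and a driver can choose the blocking or the non-blocking interface without the two diverging in behavior.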

On Thu, Aug 30, 2012 at 4:49 PM, Dave Wang <[EMAIL PROTECTED]> wrote:

> My reading of the paper is that they are actually not clear about whether
> or not HMasters were deployed on datanodes.
>
> I'm going to guess that they just used default configurations for HBase and
> YCSB, but the paper again is not specific enough.
>
> Why were they using 0.90.4 in 2012?  Would have been nice to see some of
> the more recent work done in the area of performance.
>
> One thing the paper does touch on is the relative difficulty of standing up
> the cluster, which has not changed since 0.90.4.  I think that's definitely
> something that could be improved upon.
>
> - Dave
>
> On Thu, Aug 30, 2012 at 6:27 AM, Cristofer Weber <[EMAIL PROTECTED]> wrote:
>
> > Just read this article, "Solving Big Data Challenges for Enterprise
> > Application Performance Management," published this month in Volume 5,
> > No. 12 of Proceedings of the VLDB Endowment, where they measured 6
> > different databases - Project Voldemort, Redis, HBase, Cassandra, MySQL
> > Cluster and VoltDB - with YCSB on two different kinds of clusters,
> > memory-bound and disk-bound, and I have doubts about the results for
> > HBase since:
> >
> >
> > * HBase version was 0.90.4
> >
> > * Master nodes were deployed together with data nodes
> >
> > * They didn't report tuning parameters
> >
> > There's also a paragraph where they reported that HBase failed frequently
> > in non-deterministic ways while running YCSB.
> >
> > My intention with this e-mail is to ask for opinions from those of you who
> > are more experienced with HBase on how this experiment's setup could be
> > changed to improve read operations, since in this setup HBase did not
> > perform as well as Cassandra and Project Voldemort.
> >
> > Here's the article:
> > http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf and Volume 5
> > home: http://vldb.org/pvldb/vol5.html

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)