HBase >> mail # user >> [maybe off-topic?] article: Solving Big Data Challenges for Enterprise Application Performance Management


Re: [maybe off-topic?] article: Solving Big Data Challenges for Enterprise Application Performance Management
This paper may well have ended up benchmarking the relative performance of
the YCSB drivers rather than the databases themselves. Some takeaways for me
here are:

    - Cluster setup is still too difficult

    - There are opportunities for autotuning that would make it easier for
users to get it right the first time and for academics and casual
benchmarkers alike to get a good result without becoming experts with HBase
configuration

    - The client library has been evolving toward fully async dispatch. We
should focus on this, and perhaps even consider reimplementing the sync
client on a refactored async core. We should also look at making the
Thrift-based work that FB contributed front and center, because then native
clients become possible.

    - Given the above client work, the YCSB HBase driver should be
rewritten.

On Thu, Aug 30, 2012 at 4:49 PM, Dave Wang <[EMAIL PROTECTED]> wrote:

> My reading of the paper is that they are actually not clear about whether
> or not HMasters were deployed on datanodes.
>
> I'm going to guess that they just used default configurations for HBase and
> YCSB, but the paper again is not specific enough.
>
> Why were they using 0.90.4 in 2012?  Would have been nice to see some of
> the more recent work done in the area of performance.
>
> One thing the paper does touch on is the relative difficulty of standing up
> the cluster, which has not changed since 0.90.4.  I think that's definitely
> something that could be improved upon.
>
> - Dave
>
> On Thu, Aug 30, 2012 at 6:27 AM, Cristofer Weber <[EMAIL PROTECTED]> wrote:
>
> > Just read this article, "Solving Big Data Challenges for Enterprise
> > Application Performance Management", published this month in Volume 5,
> > No. 12 of Proceedings of the VLDB Endowment, where they measured 6
> > different databases - Project Voldemort, Redis, HBase, Cassandra, MySQL
> > Cluster and VoltDB - with YCSB on two different kinds of clusters,
> > Memory-bound and Disk-bound, and I have doubts about the results for
> > HBase since:
> >
> >
> > *         HBase version was 0.90.4
> >
> > *         Master nodes were deployed together with data nodes
> >
> > *         They didn't report tuning parameters
> >
> > There's also a paragraph where they reported that HBase failed frequently
> > in non-deterministic ways while running YCSB.
> >
> > My intention with this e-mail is to look for opinions from you, who are
> > more experienced with HBase, on where this experiment's setup could be
> > changed to improve read operations, since in this setup HBase did not
> > perform as well as Cassandra and Project Voldemort.
> >
> > Here's the article:
> > http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf and Volume 5
> > home: http://vldb.org/pvldb/vol5.html
> >
> >
> >
> >
>

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)