Accumulo >> mail # dev >> a model for accumulo write scaling performance


Aaron Cordova 2012-02-24, 19:35
Clint Green 2012-02-24, 20:03
Keith Turner 2012-02-24, 20:27
RE: a model for accumulo write scaling performance


 

That may be a good metric for your workload on EC2 virtualized hardware at
different scales; could be useful for regression testing different versions
of Hadoop + Accumulo. Certainly, workload and hardware differences could
lead to a different model.

 

From: Aaron Cordova [mailto:[EMAIL PROTECTED]]
Sent: Friday, February 24, 2012 2:36 PM
To: [EMAIL PROTECTED]
Subject: a model for accumulo write scaling performance

 

In my experience with Accumulo on EC2, I've seen about an 85% increase in
aggregate write rate each time the size of the cluster is doubled. I've
tried to capture that behavior in a model to help myself understand it.

 

The model I came up with is the following:

[equation graphic not preserved in the archive]

where

            w: aggregate write rate (writes per second)

            m: number of machines

            k: standalone single-server write rate (in my experience about
30k writes per second on average)

 

the units of k and w are writes per second

 

for those of you without the ability to see graphics in email, the model is:

            

            w = m * pow(0.85, log(m, 2)) * k
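To make the text form of the model concrete, here is a minimal Python sketch that evaluates it for a few cluster sizes. The function name `aggregate_write_rate`, the parameter name `efficiency`, and the default k = 30,000 (taken from the "about 30k writes per second" figure above) are illustrative assumptions, not part of the original message.

```python
import math


def aggregate_write_rate(m, k=30_000.0, efficiency=0.85):
    """Evaluate w = m * pow(efficiency, log(m, 2)) * k.

    m          -- number of machines (m >= 1)
    k          -- standalone single-server write rate, writes per second
                  (assumed default: 30,000, from the figure quoted above)
    efficiency -- per-doubling factor from the model (0.85)
    """
    return m * efficiency ** math.log2(m) * k


if __name__ == "__main__":
    # Print the predicted aggregate write rate for successive doublings.
    for m in (1, 2, 4, 8, 16):
        print(f"{m:>3} machines: {aggregate_write_rate(m):,.0f} writes/sec")
```

With these assumptions, a single machine gives k itself, and each doubling multiplies the predicted rate by 2 * 0.85.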

 

First of all, my algebra may be rusty, so it may be possible to simplify the
model ... second, does the model make sense?
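On the simplification question, one possible reduction uses the identity a^(log_b m) = m^(log_b a). This is a sketch of the algebra only, using the symbols w, m, and k as defined above:

```latex
\begin{align*}
w &= k \, m \cdot 0.85^{\log_2 m} \\
  &= k \, m \cdot m^{\log_2 0.85} \\
  &= k \, m^{\,1 + \log_2 0.85} \\
  &\approx k \, m^{0.766}
\end{align*}
% Effect of doubling the cluster under this model:
\[
\frac{w(2m)}{w(m)} = 2 \cdot 0.85 = 1.7
\]
```

Read this way, the model as written is a power law in m, and doubling the cluster multiplies the predicted rate by 1.7 (a 70% increase). If the intent is an 85% increase per doubling (a ratio of 1.85), the base would need to be 0.925 rather than 0.85, giving an exponent of about 0.888. Which of the two readings matches the measurements is not stated in the message.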
