NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
Accumulo >> mail # dev >> a model for accumulo write scaling performance


a model for accumulo write scaling performance
In my experience with Accumulo on EC2, I've seen about an 85% increase in aggregate write rate each time the size of the cluster is doubled. I've tried to capture that behavior in a model to help myself understand it.

The model I came up with is the following:

w = m * 0.85^(log2(m)) * k

where
w: aggregate write rate (writes per second)
m: number of machines
k: standalone single-server performance (in my experience, about 30k writes per second on average)

Both k and w are measured in writes per second.

for those of you without the ability to see graphics in email, the model is:

w = m * pow(0.85, log(m, 2)) * k
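
As a rough sketch, the formula can be evaluated directly in Python. The function name `aggregate_write_rate`, the `efficiency` parameter name, and the default of 30,000 writes per second for k (the rough estimate quoted above) are illustrative assumptions, not measured values.

```python
import math

def aggregate_write_rate(m, k=30_000.0, efficiency=0.85):
    """Evaluate w = m * pow(efficiency, log(m, 2)) * k for a cluster of m machines.

    k is the standalone single-server rate in writes per second; the default
    is only the rough estimate quoted in this message.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    # math.log2(m) is the same quantity as log(m, 2) in the formula.
    return m * math.pow(efficiency, math.log2(m)) * k

if __name__ == "__main__":
    # Print the modelled aggregate rate for a few cluster sizes.
    for m in (1, 2, 4, 8, 16):
        print(m, round(aggregate_write_rate(m)))
```

Under this model, each doubling of m multiplies w by 2 * 0.85 = 1.7. That is a 70% gain per doubling, with the 0.85 factor acting as the per-machine efficiency retained at each doubling.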

Two questions. First, my algebra may be rusty, so it may be possible to simplify the model. Second, does the model make sense?
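
On the simplification question: because a^(log_b(m)) = m^(log_b(a)), the term pow(0.85, log(m, 2)) equals pow(m, log(0.85, 2)). The model therefore appears to collapse to a single power law, w = k * pow(m, 1 + log(0.85, 2)) = k * pow(m, log(1.7, 2)), with an exponent of roughly 0.766. The sketch below checks this numerically; the function names and the constant `K` are hypothetical, and `K` is just the 30k estimate from above.

```python
import math

K = 30_000.0  # standalone rate in writes/second (rough estimate, an assumption)

# Exponent of the simplified power law: 1 + log2(0.85) == log2(1.7)
EXPONENT = 1 + math.log2(0.85)

def model_original(m, k=K):
    """w = m * pow(0.85, log(m, 2)) * k, as written in the message."""
    return m * math.pow(0.85, math.log2(m)) * k

def model_simplified(m, k=K):
    """w = k * pow(m, log(1.7, 2)), the proposed simplification."""
    return k * math.pow(m, EXPONENT)

if __name__ == "__main__":
    # The two forms should agree up to floating-point rounding.
    for m in (1, 2, 3, 10, 100):
        assert math.isclose(model_original(m), model_simplified(m), rel_tol=1e-9)
    print("exponent:", EXPONENT)
```

Since the exponent is below 1, the simplified form makes the sublinear scaling explicit: aggregate throughput keeps growing with m, but the per-machine rate w / m falls as the cluster grows.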