Accumulo, mail # dev - a model for accumulo write scaling performance


a model for accumulo write scaling performance
Aaron Cordova 2012-02-24, 19:35
In my experience with Accumulo on EC2, I've seen about an 85% increase in aggregate write rate each time the size of the cluster is doubled. I've tried to capture that behavior in a model to help myself understand it.

The model I came up with is the following:

[image: model equation, given in text form below]

where
w: aggregate write rate (writes per second)
m: number of machines
k: standalone single-server performance (in my experience, about 30k writes per second on average)

the units of k and w are writes per second

for those of you without the ability to see graphics in email, the model is:

w = m * pow(0.85, log(m, 2)) * k
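
To get a feel for what the formula predicts, here is a minimal Python sketch that evaluates it for a few cluster sizes. The function name `aggregate_write_rate` and the `efficiency` parameter are just labels chosen for this illustration; the defaults (k = 30,000 and 0.85) are the figures quoted above, and `math.log2(m)` is used as the equivalent of `log(m, 2)`.

```python
import math


def aggregate_write_rate(m, k=30_000, efficiency=0.85):
    """Evaluate w = m * pow(efficiency, log2(m)) * k.

    m: number of machines
    k: standalone single-server rate in writes per second (assumed 30k)
    efficiency: the 0.85 constant from the model
    """
    return m * math.pow(efficiency, math.log2(m)) * k


# Print the predicted aggregate rate for a few power-of-two cluster sizes.
for m in (1, 2, 4, 8, 16):
    print(m, round(aggregate_write_rate(m)))
```

With m = 1 the exponent is zero, so the function returns k, which matches the definition of k as single-server performance.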

First, my algebra may be rusty, so it may be possible to simplify the model. Second, does the model make sense?
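
On the simplification question, one possible reduction (a sketch using only the identity a^(log2 m) = m^(log2 a), not something verified against measurements) collapses the model to a single power law in m:

```latex
\begin{align*}
w &= k \, m \cdot 0.85^{\log_2 m} \\
  &= k \, m \cdot m^{\log_2 0.85} && \text{since } a^{\log_2 m} = m^{\log_2 a} \\
  &= k \, m^{\,1 + \log_2 0.85} \;\approx\; k \, m^{0.766}
\end{align*}
```

In this form, doubling m multiplies w by 2 * 0.85 = 1.7. So the formula as written appears to encode a 70% gain per doubling, with 0.85 acting as a per-doubling efficiency. If the observed 85% increase is meant literally, the factor per doubling would be 1.85 instead, which would correspond to w = k * pow(1.85, log(m, 2)).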