I did similar work a couple of months ago. Start by sharing more
details on your program, your HBase setup, and the way you measure
network and disk bottlenecks.
Also, have you isolated network and disk on all nodes and between all
nodes (each pair of nodes)? Test them separately and give us those
numbers.
Next, do a copyFromLocal to HDFS from the master with a file that is at
least the size of your machine's memory (to make sure you write to disk
and not just to Linux memory). Tell us the copy throughput.
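
One way to get that throughput number is to time the copy and divide the file size by the elapsed time. The following is only a rough sketch, assuming the `hadoop` command is on the PATH of the master; the function names and the two path arguments are made up for illustration.

```python
import os
import subprocess
import sys
import time


def throughput_mb_per_s(num_bytes, seconds):
    """Return throughput in MB/s, counting 1 MB as 10**6 bytes."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / 1e6 / seconds


def timed_copy(local_path, hdfs_path):
    """Time `hadoop fs -copyFromLocal` and return the observed MB/s.

    Assumes the `hadoop` CLI is installed and configured on this host.
    """
    size_bytes = os.path.getsize(local_path)
    start = time.monotonic()
    subprocess.run(
        ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_path],
        check=True,  # raise if the copy fails instead of reporting a bogus rate
    )
    return throughput_mb_per_s(size_bytes, time.monotonic() - start)


if __name__ == "__main__":
    # Hypothetical usage: python copy_rate.py /data/big.file /tmp/big.file
    local, remote = sys.argv[1], sys.argv[2]
    print(f"{timed_copy(local, remote):.1f} MB/s")
```

The measured time includes starting the Hadoop client, so for a file as large as the machine's memory that overhead should be small compared with the copy itself.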
On Jan 11, 2013, at 06:31, Bryan Keller <[EMAIL PROTECTED]> wrote:
> I am attempting to configure HBase to maximize throughput, and have noticed some bottlenecks. In particular, with my configuration, write performance is well below theoretical throughput. I have a test program that inserts many rows into a test table. Network I/O is less than 20% of max, and disk I/O is even lower, maybe around 5% of max on all boxes in the cluster. CPU is well below 50% of max on all boxes. I do not see any I/O waits or anything in particular that raises concerns. I am using iostat and iftop to test throughput. To determine theoretical max, I used dd and iperf. I have spent quite a bit of time optimizing the HBase config parameters, optimizing GC, etc., and am familiar with the HBase book online and such.