Re: Performance test results
Hi J-D,
I don't think it's a Thrift issue. First, I use the TBufferedTransport
transport. Second, I implemented my own connection pool, so the same
connections are reused over and over again and there is no overhead
for opening and closing connections (I've verified that using
Wireshark). Third, if it were a client capacity issue I would expect to
see an increase in throughput as I add more threads or run the test on
two servers in parallel. That doesn't seem to happen; the total
capacity remains unchanged.
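
To illustrate the connection reuse described above, here is a minimal sketch of a fixed-size pool in Python. It is not the C# pool used in this test; `ConnectionPool` and the `factory` callable are hypothetical names, and `factory` stands in for whatever opens one buffered Thrift connection.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool that lends out connections opened once at startup."""

    def __init__(self, factory, size):
        # factory is a hypothetical zero-argument callable that opens one
        # connection (for example, a buffered Thrift transport plus client).
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Block until a connection is idle, lend it out, then put it back.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)
```

Each worker thread would wrap its calls in `with pool.connection() as conn:`. The sketch returns a connection to the pool even after an error, so a real pool would also need to detect and replace broken connections.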

As for metrics, I already have them configured and monitored using
Zabbix, but it only monitors specific counters, so let me know what
information you would like to see. The numbers I quoted before are
based on client counters and correlated with server counters ("multi"
for writes and "get" for reads).

-eran

On Thu, Apr 21, 2011 at 20:43, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>
> Hey Eran,
>
> Glad you could go back to debugging performance :)
>
> The scalability issues you are seeing are unknown to me; it sounds
> like the client isn't pushing it enough. It reminded me of when we
> switched to using the native Thrift PHP extension instead of the
> "normal" one and we saw huge speedups. My limited knowledge of Thrift
> may be blinding me, but I looked around for C# Thrift performance
> issues and found threads like this one
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg00320.html
>
> As you didn't really debug the speed of Thrift itself in your setup,
> this is one more variable in the problem.
>
> Also you don't really provide metrics about your system apart from
> requests/second. Would it be possible for you to set them up using this
> guide? http://hbase.apache.org/metrics.html
>
> J-D
>
> On Thu, Apr 21, 2011 at 5:13 AM, Eran Kutner <eran@> wrote:
> > Hi J-D,
> > After stabilizing the configuration, with your great help, I was able
> > to go back to the load tests. I tried using IRC, as you suggested,
> > to continue this discussion but because of the time difference (I'm
> > GMT+3) it is quite difficult to find a time when people are present
> > and I am available to run long tests, so I'll give the mailing list
> > one more try.
> >
> > I tested again on a clean table using 100 insert threads, each using a
> > separate keyspace within the test table. Every row had just one column
> > with 128 bytes of data.
> > With one server and one region I got about 2300 inserts per second.
> > After manually splitting the region I got about 3600 inserts per
> > second (still on one machine). After a while the regions were balanced
> > and one was moved to another server, which brought throughput to around
> > 4500 writes per second. Additional splits and moves to more servers didn't
> > improve this number and the write performance stabilized at ~4000
> > writes/sec per server. This seems pretty low, especially considering
> > other numbers I've seen around here.
> >
> > Read performance is at around 1500 rows per second per server, which
> > seems extremely low to me, especially considering that all the working
> > set I was querying could fit in the servers' memory. To make the test
> > interesting I limited my client to fetch only 1 row (always the same
> > one) from each keyspace, which yielded 10K reads per sec per server. I
> > then increased the range to read the same 10 rows, and performance
> > dropped to 8500 reads/sec per server. Increasing the range to 100 rows
> > dropped performance to around 3500 reads per second per server.
> > Do you have any idea what could explain this behavior, and how I can
> > get a decent number of reads from those servers?
> >
> > -eran
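
One way to read the write numbers quoted above is through Little's law: in a closed-loop test, throughput is roughly the number of in-flight requests divided by the mean time per request. The sketch below applies that to the three reported stages. It assumes each of the 100 threads keeps exactly one synchronous insert in flight and that each reported rate is the total across the cluster; neither is confirmed in the thread, and `implied_latency_ms` is a hypothetical helper.

```python
def implied_latency_ms(threads, requests_per_sec):
    """Mean time per request if every thread keeps exactly one request in flight."""
    return threads / requests_per_sec * 1000.0


THREADS = 100  # insert threads reported for the write test

# Inserts/sec reported at each stage of the write test (assumed to be totals).
stages = [
    ("one server, one region", 2300),
    ("one server, after manual split", 3600),
    ("two servers, after rebalance", 4500),
]

for label, rate in stages:
    latency = implied_latency_ms(THREADS, rate)
    print(f"{label}: {rate} inserts/s -> {latency:.1f} ms per insert")
```

Under those assumptions, every stage implies tens of milliseconds per insert as seen by the client, which is the kind of figure that server-side latency metrics could confirm or rule out.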