HBase >> mail # user >> Performance test results


Re: Performance test results
Hi Josh,

The connection pooling code is attached AS IS (with all the usual legal
disclaimers). Note that you will have to modify it a bit to get it to
compile, because it depends on some internal libraries we use. In particular,
DynamicAppSettings and Log are two internal classes that do what their names
imply :)
Make sure you initialize "servers" in the NewConnection() method to an array
with your Thrift servers and you should be good to go. Use GetConnection()
to get a connection and ReturnConnection() to return it to the pool when you
finish using it - make sure you don't close it in the application code.
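The attached code itself is C# and depends on internal classes, so it isn't reproduced here; below is a minimal, generic sketch of the same pattern in Python. All names (FakeConnection, the pool size, the server list) are illustrative stand-ins, not the actual code - a real pool would hold open Thrift transports to the HBase Thrift servers.

```python
import queue

class FakeConnection:
    """Stand-in for a pooled Thrift connection (hypothetical; a real
    pool would wrap an open Thrift transport to an HBase Thrift server)."""
    def __init__(self, server):
        self.server = server

class ConnectionPool:
    def __init__(self, servers, size=1):
        # Pre-open a fixed number of connections, round-robin over the
        # server list - mirroring the "servers" array set up in
        # NewConnection() in the attached code.
        self._pool = queue.Queue()
        for i in range(size):
            self._pool.put(FakeConnection(servers[i % len(servers)]))

    def get_connection(self):
        # Blocks until a connection is free. Callers must hand the
        # connection back with return_connection(), never close it.
        return self._pool.get()

    def return_connection(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(["thrift1:9090", "thrift2:9090"], size=1)
c1 = pool.get_connection()
pool.return_connection(c1)
c2 = pool.get_connection()
# The same connection object comes back - it is reused, not reopened.
assert c1 is c2
```

Because connections are recycled rather than reopened, there is no per-request TCP/Thrift handshake overhead, which is the point Eran verified with Wireshark below.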

-eran

On Wed, Apr 27, 2011 at 00:30, Josh <[EMAIL PROTECTED]> wrote:

> On Tue, Apr 26, 2011 at 3:34 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> > Hi J-D,
> > I don't think it's a Thrift issue. First, I use the TBufferedTransport
> > transport; second, I implemented my own connection pool so the same
> > connections are reused over and over again,
>
> Hey!  I'm using C#->HBase and high on my list of things to do is
> 'Implement Thrift Connection Pooling in C#'.  Do you have any desire to
> release that code?
>
>
> > so there is no overhead
> > for opening and closing connections (I've verified that using
> > Wireshark). Third, if it were a client capacity issue I would expect to
> > see an increase in throughput as I add more threads or run the test on
> > two servers in parallel; this doesn't seem to happen - the total
> > capacity remains unchanged.
> >
> > As for metrics, I already have them configured and monitored using
> > Zabbix, but it only monitors specific counters, so let me know what
> > information you would like to see. The numbers I quoted before are
> > based on client counters and correlated with server counters ("multi"
> > for writes and "get" for reads).
> >
> > -eran
> >
> >
> >
> > On Thu, Apr 21, 2011 at 20:43, Jean-Daniel Cryans <[EMAIL PROTECTED]>
> wrote:
> >>
> >> Hey Eran,
> >>
> >> Glad you could go back to debugging performance :)
> >>
> >> The scalability issues you are seeing are unknown to me; it sounds
> >> like the client isn't pushing it enough. It reminded me of when we
> >> switched to using the native Thrift PHP extension instead of the
> >> "normal" one and we saw huge speedups. My limited knowledge of Thrift
> >> may be blinding me, but I looked around for C# Thrift performance
> >> issues and found threads like this one
> >> http://www.mail-archive.com/[EMAIL PROTECTED]/msg00320.html
> >>
> >> As you didn't really debug the speed of Thrift itself in your setup,
> >> this is one more variable in the problem.
> >>
> >> Also you don't really provide metrics about your system apart from
> >> requests/second. Would it be possible for you to set them up using this
> >> guide? http://hbase.apache.org/metrics.html
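For context, the metrics guide linked above describes (for HBase of that era, ~0.90) enabling metrics through conf/hadoop-metrics.properties. A minimal example might look like the fragment below; the exact context class names depend on your Hadoop/HBase version, so treat this as an illustration and check the guide for the names that match your install.

```properties
# conf/hadoop-metrics.properties (illustrative; verify class names
# against the metrics guide for your HBase version)
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.period=10
hbase.servers=ganglia-host:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=ganglia-host:8649
```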
> >>
> >> J-D
> >>
> >> On Thu, Apr 21, 2011 at 5:13 AM, Eran Kutner <eran@> wrote:
> >> > Hi J-D,
> >> > After stabilizing the configuration, with your great help, I was able
> >> > to go back to the load tests. I tried using IRC, as you suggested,
> >> > to continue this discussion but because of the time difference (I'm
> >> > GMT+3) it is quite difficult to find a time when people are present
> >> > and I am available to run long tests, so I'll give the mailing list
> >> > one more try.
> >> >
> >> > I tested again on a clean table using 100 insert threads, each using
> >> > a separate keyspace within the test table. Every row had just one column
> >> > with 128 bytes of data.
> >> > With one server and one region I got about 2300 inserts per second.
> >> > After manually splitting the region I got about 3600 inserts per
> >> > second (still on one machine). After a while the regions were balanced
> >> > and one was moved to another server, which brought the rate to around
> >> > 4500 writes per second. Additional splits and moves to more servers
> >> > didn't improve this number, and the write performance stabilized at
> >> > ~4000 writes/sec per server. This seems pretty low, especially considering
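The load test described above (N client threads, each writing 128-byte rows into its own key range) follows a standard pattern. A minimal sketch, with a no-op fake_insert standing in for the real Thrift write call (thread counts and op counts here are scaled down and purely illustrative):

```python
import threading
import time

ROW_SIZE = 128          # bytes per row, as in the test above
NUM_THREADS = 8         # the test above used 100 insert threads
OPS_PER_THREAD = 1000   # illustrative; real runs are much longer

counts = [0] * NUM_THREADS

def fake_insert(key, value):
    pass  # hypothetical stand-in; a real client would call the HBase Thrift API

def insert_worker(tid):
    # Each thread writes into its own keyspace (keys prefixed with the
    # thread id), so threads never contend on the same rows.
    payload = b"x" * ROW_SIZE
    for i in range(OPS_PER_THREAD):
        fake_insert(f"{tid:03d}-{i:08d}", payload)
        counts[tid] += 1

threads = [threading.Thread(target=insert_worker, args=(t,))
           for t in range(NUM_THREADS)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
total = sum(counts)
rate = total / elapsed if elapsed > 0 else float("inf")
print(f"{total} inserts in {elapsed:.3f}s -> {rate:.0f} ops/sec")
```

Note that with a real blocking Thrift call in fake_insert, the threads spend most of their time waiting on I/O, so adding threads should raise aggregate throughput until something saturates - which is exactly why the flat ~4000 writes/sec per server observed above points away from the client and toward the server side.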