HBase >> mail # dev >> HBase read performance and HBase client


Re: HBase read performance and HBase client
Michael, the network is not a bottleneck, since the raw KV size is 62 bytes.
1GbE can pump > 1M of these objects per sec.
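
As a back-of-the-envelope check of that claim, here is a minimal Java sketch. It assumes an ideal 1 Gbit/s link, and the 60-byte per-object overhead in the second call is a purely hypothetical figure for RPC and TCP/IP framing, not something measured in this test. The class and method names are made up for illustration.

```java
public class WireRate {
    // Upper bound on objects per second a link can carry.
    // Ignores latency, batching and any server-side cost; only counts bytes on the wire.
    static long objectsPerSecond(long linkBitsPerSecond, int payloadBytes, int overheadBytes) {
        long bytesPerSecond = linkBitsPerSecond / 8;
        return bytesPerSecond / (payloadBytes + overheadBytes);
    }

    public static void main(String[] args) {
        long oneGbE = 1_000_000_000L; // assumption: ideal 1 Gbit/s, no bonding

        // Raw 62-byte KVs with no framing at all.
        System.out.println(objectsPerSecond(oneGbE, 62, 0));

        // Same KVs with a hypothetical 60 bytes of per-object overhead.
        System.out.println(objectsPerSecond(oneGbE, 62, 60));
    }
}
```

By the same arithmetic, 170K gets per sec at 62 bytes each is about 10.5 MB/s (roughly 84 Mbit/s) of raw KV payload, well under what a single 1GbE link can carry.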

Block cache is enabled, size ~2GB, the query data set is less than 1MB, and
the block cache hit rate is 99% (I think it's 99.99% in reality).

On Thu, Aug 1, 2013 at 12:10 PM, Michael Segel <[EMAIL PROTECTED]> wrote:

> Ok... Bonded 1GbE is less than 2GbE, not sure of actual max throughput.
>
> Are you hitting data in cache or are you fetching data from disk?
> I mean can we rule out disk I/O because the data would most likely be in
> cache?
>
> Are you monitoring your cluster w Ganglia? What do you see in terms of
> network traffic?
> Are all of the nodes in the test cluster on the same switch? Including the
> client?
>
>
> (Sorry, I'm currently looking at a network problem so now everything I see
> may be a networking problem. And a guy from Arista found me after our
> meetup last night so I am thinking about the impact on networking in the
> ecosystem. :-).  )
>
>
> -Just some guy out in left field...
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Aug 1, 2013, at 1:11 PM, "Vladimir Rodionov" <[EMAIL PROTECTED]>
> wrote:
>
> > 2x1Gb bonded, I think. This is our standard config.
> >
> >
> > On Thu, Aug 1, 2013 at 10:27 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >
> >> Network? 1GbE or 10GbE?
> >>
> >> Sent from a remote device. Please excuse any typos...
> >>
> >> Mike Segel
> >>
> >> On Jul 31, 2013, at 9:27 PM, "Vladimir Rodionov" <[EMAIL PROTECTED]>
> >> wrote:
> >>
> >>> Some final numbers :
> >>>
> >>> Test config:
> >>>
> >>> HBase 0.94.6
> >>> blockcache=true, block size = 64K, KV size = 62 bytes (raw).
> >>>
> >>> 5 Clients: 96GB, 16(32) CPUs (2.2GHz), CentOS 5.7
> >>> 1 RS Server: the same config.
> >>>
> >>> Local network with ping between hosts: 0.1 ms
> >>>
> >>>
> >>> 1. HBase client hits the wall at ~50K per sec regardless of # of CPUs,
> >>> threads, IO pool size and other settings.
> >>> 2. HBase server was able to sustain 170K per sec (with 64K block size),
> >>> all from block cache. KV size = 62 bytes (very small). This is for single
> >>> Get op, 60 threads per client, 5 clients (on different hosts).
> >>> 3. Multi-get hits the wall at the same 170K-200K per sec. Batch sizes
> >>> tested: 30, 100. Absolutely the same performance as with batch size > 1.
> >>> Multi-get has some internal issues on the RegionServer side. Maybe
> >>> excessive locking or something else.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov
> >>> <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> 1. SCR are enabled
> >>>> 2. A single Configuration for all tables did not work well, but I will
> >>>> try it again
> >>>> 3. With Nagle I had 0.8ms avg, w/o - 0.4ms - I see the difference
> >>>>
> >>>>
> >>>> On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <[EMAIL PROTECTED]>
> >>>> wrote:
> >>>>
> >>>>> With Nagle's you'd see something around 40ms. You are not saying
> >>>>> 0.8ms RTT is bad, right? Are you seeing ~40ms latencies?
> >>>>>
> >>>>> This thread has gotten confusing.
> >>>>>
> >>>>> I would try these:
> >>>>> * one Configuration for all tables. Or even use a single
> >>>>> HConnection/Threadpool and use the HTable(byte[], HConnection,
> >>>>> ExecutorService) constructor
> >>>>> * disable Nagle's: set both ipc.server.tcpnodelay and
> >>>>> hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client
> >>>>> *and* server)
> >>>>> * increase hbase.client.ipc.pool.size in client's hbase-site.xml
> >>>>> * enable short circuit reads (details depend on exact version of
> >>>>> Hadoop). Google will help :)
> >>>>>
> >>>>> -- Lars
> >>>>>
> >>>>>
> >>>>> ----- Original Message -----
> >>>>> From: Vladimir Rodionov <[EMAIL PROTECTED]>
> >>>>> To: [EMAIL PROTECTED]
> >>>>> Cc:
> >>>>> Sent: Tuesday, July 30, 2013 1:30 PM
> >>>>> Subject: Re: HBase read performance and HBase client
> >>>>>
> >>>>> This hbase.ipc.client.tcpnodelay (default - false) explains poor