Re: HBase read performance and HBase client
All tests I have run were hitting a single region on a region server. I
suspect this is not the right scenario. There are some points in the Store
class which are heavily synchronized:

For example this one:
  // All access must be synchronized.
  private final CopyOnWriteArraySet<ChangedReadersObserver> changedReaderObservers =
      new CopyOnWriteArraySet<ChangedReadersObserver>();

I will re-run the tests against all available regions on the RS and will post
results later today.
On Wed, Jul 31, 2013 at 11:15 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> Yeah, that would seem to indicate that seeking into the block is not a
> bottleneck (and you said earlier that everything fits into the blockcache).
> Need to profile to know more. If you have time, it would be cool if you could
> start jvisualvm, attach it to the RS, start the profiling, and let the
> workload run for a bit.
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Vladimir Rodionov <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Cc:
> Sent: Wednesday, July 31, 2013 9:57 PM
> Subject: Re: HBase read performance and HBase client
>
> A smaller block size (32K) does not give any performance gain, and this is
> strange, to say the least.
>
>
> On Wed, Jul 31, 2013 at 9:33 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > It would be interesting to profile MultiGet. With an RTT of 0.1ms, the
> > internal RS friction is probably the main contributor.
> > In fact MultiGet just loops over the set at the RS and calls single Gets
> > on the various regions.
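
For reference, the client side of such a multi-get on 0.94 looks roughly like
the sketch below; the table name, row keys, and column are placeholders rather
than anything taken from this thread.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetSketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test_table");       // placeholder table name
        try {
          List<Get> batch = new ArrayList<Get>();
          for (int i = 0; i < 100; i++) {                    // batch size 100, as in the tests quoted below
            Get get = new Get(Bytes.toBytes("row-" + i));    // placeholder row keys
            get.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"));
            batch.add(get);
          }
          // One client call; on the server side the RS loops over the batch
          // and executes each Get individually, as described above.
          Result[] results = table.get(batch);
          System.out.println("got " + results.length + " results");
        } finally {
          table.close();
        }
      }
    }
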
> >
> > Each Get needs to reseek into the block (even when it is cached, since KVs
> > have variable size).
> >
> > See HBASE-6136 and HBASE-8362.
> >
> >
> > -- Lars
> >
> > ________________________________
> > From: Vladimir Rodionov <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> > Sent: Wednesday, July 31, 2013 7:27 PM
> > Subject: Re: HBase read performance and HBase client
> >
> >
> > Some final numbers:
> >
> > Test config:
> >
> > HBase 0.94.6
> > blockcache=true, block size = 64K, KV size = 62 bytes (raw).
> >
> > 5 clients: 96GB, 16(32) CPUs (2.2GHz), CentOS 5.7
> > 1 RS Server: the same config.
> >
> > Local network with ping between hosts: 0.1 ms
> >
> >
> > 1. The HBase client hits a wall at ~50K ops per sec regardless of # of CPUs,
> > threads, IO pool size, and other settings.
> > 2. The HBase server was able to sustain 170K ops per sec (with 64K block
> > size), all from the block cache. KV size = 62 bytes (very small). This is
> > for single Get ops, 60 threads per client, 5 clients (on different hosts).
> > 3. Multi-get hits a wall at the same 170K-200K ops per sec. Batch sizes
> > tested: 30, 100. Absolutely the same performance as with batch size = 1.
> > Multi-get has some internal issues on the RegionServer side, maybe excessive
> > locking or something else.
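
As a point of reference, the single-Get load described in point 2 might look
roughly like the sketch below. This is not the actual benchmark code: the table
name, column, and row key are placeholders, and every Get hits a single
row/region, as noted at the top of the thread.

    import java.io.IOException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SingleGetLoad {
      private static final int THREADS = 60;            // 60 threads per client, as above
      private static final int GETS_PER_THREAD = 100000;

      public static void main(String[] args) throws Exception {
        final Configuration conf = HBaseConfiguration.create();
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        long start = System.currentTimeMillis();
        for (int t = 0; t < THREADS; t++) {
          pool.submit(new Runnable() {
            public void run() {
              try {
                // One Configuration shared by all threads, so all HTables share
                // the same underlying HConnection.
                HTable table = new HTable(conf, "test_table");   // placeholder table name
                Get get = new Get(Bytes.toBytes("row-0"));       // single row/region
                get.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"));
                for (int i = 0; i < GETS_PER_THREAD; i++) {
                  table.get(get);                                // one RPC per Get
                }
                table.close();
              } catch (IOException e) {
                e.printStackTrace();
              }
            }
          });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        long millis = System.currentTimeMillis() - start;
        System.out.println("ops/sec ~ "
            + (THREADS * (long) GETS_PER_THREAD * 1000L / millis));
      }
    }
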
> >
> >
> >
> >
> >
> > On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov
> > <[EMAIL PROTECTED]> wrote:
> >
> > > 1. SCR are enabled
> > > 2. A single Configuration for all tables did not work well, but I will
> > > try it again
> > > 3. With Nagle's I had 0.8ms avg, without it 0.4ms - I see the difference
> > >
> > >
> > > On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> > >
> > >> With Nagle's you'd see something around 40ms. You are not saying 0.8ms
> > >> RTT is bad, right? Are you seeing ~40ms latencies?
> > >>
> > >> This thread has gotten confusing.
> > >>
> > >> I would try these (see the sketch after this list):
> > >> * one Configuration for all tables, or even use a single
> > >> HConnection/Threadpool and the HTable(byte[], HConnection,
> > >> ExecutorService) constructor
> > >> * disable Nagle's: set both ipc.server.tcpnodelay and
> > >> hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client *and*
> > >> server)
> > >> * increase hbase.client.ipc.pool.size in the client's hbase-site.xml
> > >> * enable short circuit reads (details depend on the exact version of
> > >> Hadoop).
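
A rough sketch of the client-side pieces of the first three suggestions,
assuming the 0.94 HConnectionManager.createConnection(...) API; the table name
and pool sizes are placeholders, and ipc.server.tcpnodelay still has to be set
in the region server's hbase-site.xml.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConnectionSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.setBoolean("hbase.ipc.client.tcpnodelay", true);  // disable Nagle's on the client
        conf.setInt("hbase.client.ipc.pool.size", 10);         // placeholder pool size

        // One connection and one thread pool shared by every HTable instance.
        HConnection connection = HConnectionManager.createConnection(conf);
        ExecutorService pool = Executors.newFixedThreadPool(60); // placeholder size
        try {
          HTable table = new HTable(Bytes.toBytes("test_table"), connection, pool);
          // ... issue Gets / multi-gets from many threads, each thread with its
          // own HTable created this way ...
          table.close();
        } finally {
          pool.shutdown();
          connection.close();
        }
      }
    }
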