HBase, mail # dev - HBase read performance and HBase client


Re: HBase read performance and HBase client
Ted Yu 2013-08-01, 16:27
Vlad:
You might want to look at HBASE-9087 Handlers being blocked during reads

On Thu, Aug 1, 2013 at 9:24 AM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:

> All tests I have run were hitting a single region on a region server. I
> suspect this is not the right scenario. There are some points in the Store
> class that are heavily synchronized:
>
> For example this one:
>   // All access must be synchronized.
>   private final CopyOnWriteArraySet<ChangedReadersObserver> changedReaderObservers =
>       new CopyOnWriteArraySet<ChangedReadersObserver>();
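>
> For context, a condensed sketch of how that set is used (the method names
> follow the 0.94 Store source, but the wiring here is simplified and the
> surrounding locking is omitted, so treat it as an illustration rather than
> the exact implementation):
>
>   import java.io.IOException;
>   import java.util.concurrent.CopyOnWriteArraySet;
>
>   interface ChangedReadersObserver {
>     void updateReaders() throws IOException;
>   }
>
>   class StoreSketch {
>     private final CopyOnWriteArraySet<ChangedReadersObserver> changedReaderObservers =
>         new CopyOnWriteArraySet<ChangedReadersObserver>();
>
>     // Each scanner registers itself as an observer of reader changes.
>     void addChangedReaderObserver(ChangedReadersObserver o) {
>       changedReaderObservers.add(o);
>     }
>
>     // Flushes/compactions walk the set and force every scanner to reseek;
>     // this notification path is one of the synchronized hot spots.
>     void notifyChangedReadersObservers() throws IOException {
>       for (ChangedReadersObserver o : changedReaderObservers) {
>         o.updateReaders();
>       }
>     }
>   }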
>
> I will re-run the tests against all available regions on a RS and will post
> results later today.
>
>
>
>
> On Wed, Jul 31, 2013 at 11:15 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > Yeah, that would seem to indicate that seeking into the block is not a
> > bottleneck (and you said earlier that everything fits into the
> > blockcache).
> > Need to profile to know more. If you have time, it would be cool if you
> > could start jvisualvm, attach it to the RS, start the profiling, and let
> > the workload run for a bit.
> >
> > -- Lars
> >
> >
> >
> > ----- Original Message -----
> > From: Vladimir Rodionov <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> > Cc:
> > Sent: Wednesday, July 31, 2013 9:57 PM
> > Subject: Re: HBase read performance and HBase client
> >
> > A smaller block size (32K) does not give any performance gain, and this is
> > strange, to say the least.
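> >
> > For reference, the block size is a per-column-family HFile setting; a
> > minimal sketch of how a 32K block size can be set with the 0.94 admin API
> > (the table and family names are placeholders):
> >
> >   import org.apache.hadoop.conf.Configuration;
> >   import org.apache.hadoop.hbase.HBaseConfiguration;
> >   import org.apache.hadoop.hbase.HColumnDescriptor;
> >   import org.apache.hadoop.hbase.HTableDescriptor;
> >   import org.apache.hadoop.hbase.client.HBaseAdmin;
> >
> >   public class BlockSizeSketch {
> >     public static void main(String[] args) throws Exception {
> >       Configuration conf = HBaseConfiguration.create();
> >       HBaseAdmin admin = new HBaseAdmin(conf);
> >       HTableDescriptor desc = new HTableDescriptor("testtable"); // placeholder
> >       HColumnDescriptor family = new HColumnDescriptor("f");     // placeholder
> >       family.setBlocksize(32 * 1024); // 32K instead of the 64K default
> >       desc.addFamily(family);
> >       admin.createTable(desc);
> >       admin.close();
> >     }
> >   }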
> >
> >
> > On Wed, Jul 31, 2013 at 9:33 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >
> > > Would be interesting to profile MultiGet. With RTT of 0.1ms, the
> > > internal RS friction is probably the main contributor.
> > > In fact MultiGet just loops over the set at the RS and calls single
> > > gets on the various regions.
> > >
> > > Each Get needs to reseek into the block (even when it is cached, since
> > > KVs have variable size).
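> > >
> > > To make that concrete, a minimal client-side sketch (0.94 API; the table
> > > name and row keys are placeholders): a multi-get is a single round trip,
> > > but the RS still performs an independent seek for every Get in the batch.
> > >
> > >   import java.util.ArrayList;
> > >   import java.util.List;
> > >   import org.apache.hadoop.conf.Configuration;
> > >   import org.apache.hadoop.hbase.HBaseConfiguration;
> > >   import org.apache.hadoop.hbase.client.Get;
> > >   import org.apache.hadoop.hbase.client.HTable;
> > >   import org.apache.hadoop.hbase.client.Result;
> > >   import org.apache.hadoop.hbase.util.Bytes;
> > >
> > >   public class MultiGetSketch {
> > >     public static void main(String[] args) throws Exception {
> > >       Configuration conf = HBaseConfiguration.create();
> > >       HTable table = new HTable(conf, "testtable"); // placeholder table
> > >       List<Get> gets = new ArrayList<Get>();
> > >       for (int i = 0; i < 100; i++) {               // batch of 100 keys
> > >         gets.add(new Get(Bytes.toBytes("row-" + i)));
> > >       }
> > >       // One RPC, but the RS loops over the batch and reseeks per Get.
> > >       Result[] results = table.get(gets);
> > >       System.out.println("fetched " + results.length + " rows");
> > >       table.close();
> > >     }
> > >   }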
> > >
> > > There are HBASE-6136 and HBASE-8362.
> > >
> > >
> > > -- Lars
> > >
> > > ________________________________
> > > From: Vladimir Rodionov <[EMAIL PROTECTED]>
> > > To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> > > Sent: Wednesday, July 31, 2013 7:27 PM
> > > Subject: Re: HBase read performance and HBase client
> > >
> > >
> > > Some final numbers:
> > >
> > > Test config:
> > >
> > > HBase 0.94.6
> > > blockcache=true, block size = 64K, KV size = 62 bytes (raw).
> > >
> > > 5 Clients: 96GB, 16(32) CPUs (2.2GHz), CentOS 5.7
> > > 1 RS Server: the same config.
> > >
> > > Local network with ping between hosts: 0.1 ms
> > >
> > >
> > > 1. The HBase client hits the wall at ~50K per sec regardless of # of
> > > CPUs, threads, IO pool size, and other settings.
> > > 2. The HBase server was able to sustain 170K per sec (with 64K block
> > > size), all from block cache. KV size = 62 bytes (very small). This is
> > > for single Get ops, 60 threads per client, 5 clients (on different hosts).
> > > 3. Multi-get hits the wall at the same 170K-200K per sec. Batch sizes
> > > tested: 30, 100. Absolutely the same performance as with batch size 1.
> > > Multi-get has some internal issues on the RegionServer side, maybe
> > > excessive locking or something else.
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov
> > > <[EMAIL PROTECTED]> wrote:
> > >
> > > > 1. SCRs (short-circuit reads) are enabled
> > > > 2. A single Configuration for all tables did not work well, but I will
> > > > try it again
> > > > 3. With Nagle's I had 0.8ms avg, without it 0.4ms - I do see the
> > > > difference (see the sketch below)
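> > > >
> > > > In case it helps others on the thread, a minimal sketch of disabling
> > > > Nagle's on the client (the config key is from the 0.94 line and worth
> > > > verifying against your build; the table name is a placeholder):
> > > >
> > > >   import org.apache.hadoop.conf.Configuration;
> > > >   import org.apache.hadoop.hbase.HBaseConfiguration;
> > > >   import org.apache.hadoop.hbase.client.HTable;
> > > >
> > > >   public class NoNagleSketch {
> > > >     public static void main(String[] args) throws Exception {
> > > >       Configuration conf = HBaseConfiguration.create();
> > > >       // TCP_NODELAY on for client RPC sockets (disables Nagle's);
> > > >       // the server-side analogue is hbase.ipc.server.tcpnodelay.
> > > >       conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
> > > >       HTable table = new HTable(conf, "testtable"); // placeholder
> > > >       // ... run Gets against 'table' with Nagle's disabled ...
> > > >       table.close();
> > > >     }
> > > >   }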
> > > >
> > > >
> > > > On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> With Nagle's you'd see something around 40ms. You are not saying
> > > >> 0.8ms RTT is bad, right? Are you seeing ~40ms latencies?
> > > >>
> > > >> This thread has gotten confusing.
> > > >>
> > > >> I would try these:
> > > >> * one Configuration for all tables. Or even use a single
> > > >> HConnection/Threadpool and use the HTable(byte[], HConnection,