Re: HBase Random Read latency > 100ms
On Wed, Oct 9, 2013 at 10:59 AM, Vladimir Rodionov
<[EMAIL PROTECTED]> wrote:

> I can't say for SCR. There is a possibility that the feature is broken, of
> course.
> But the fact that hbase.regionserver.checksum.verify does not affect
> performance means that the OS effectively caches the HDFS checksum files.
>

See "OS cache + SCR" VS "HBase CRC over OS cache+SCR" in this document I
shared some time ago:
https://docs.google.com/spreadsheet/pub?key=0Ao87IrzZJSaydENaem5USWg4TlRKcHl0dEtTS2NBOUE&output=html

In an all-in-memory test it shows a pretty big difference.
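
For anyone who wants to try the comparison themselves: the HBase-side
checksum switch is a single property in hbase-site.xml. A minimal sketch
(the property came in with HBASE-5074; check the exact name against your
HBase version):

  <property>
    <name>hbase.regionserver.checksum.verify</name>
    <value>true</value>
  </property>

With this enabled, HBase verifies checksums stored inline in the HFile,
so the read path does not also have to open the separate HDFS .crc
metadata file (the extra IO Lars mentions further down the thread).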

J-D
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [EMAIL PROTECTED]
>
> ________________________________________
> From: Ramu M S [[EMAIL PROTECTED]]
> Sent: Wednesday, October 09, 2013 12:11 AM
> To: [EMAIL PROTECTED]; lars hofhansl
> Subject: Re: HBase Random Read latency > 100ms
>
> Hi All,
>
> Sorry, there was a mistake in the tests (the clients were not reduced; I
> forgot to change the parameter before running the tests).
>
> With 8 Clients and,
>
> SCR Enabled:  Average Latency is 25 ms, IO Wait % is around 8
> SCR Disabled: Average Latency is 10 ms, IO Wait % is around 2
>
> Still, SCR disabled gives better results, which confuses me. Can anyone
> clarify?
>
> Also, I tried setting the parameter Lars suggested
> (hbase.regionserver.checksum.verify = true) with SCR disabled.
> Average latency is around 9.8 ms, a fraction lower.
>
> Thanks
> Ramu
>
>
> On Wed, Oct 9, 2013 at 3:32 PM, Ramu M S <[EMAIL PROTECTED]> wrote:
>
> > Hi All,
> >
> > I just ran only 8 parallel clients,
> >
> > With SCR Enabled:  Average Latency is 80 ms, IO Wait % is around 8
> > With SCR Disabled: Average Latency is 40 ms, IO Wait % is around 2
> >
> > I always thought that enabling SCR allows a client co-located with the
> > DataNode to read HDFS file blocks directly, which gives a performance
> > boost to distributed clients that are aware of locality.
> >
> > Is my understanding wrong, or does it simply not apply to my scenario?
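> >
> > (For context: short-circuit reads are typically enabled with something
> > like the following in hdfs-site.xml on the DataNodes and the HBase
> > nodes; the property names are from the Hadoop 2.x docs, and the socket
> > path is only an example.
> >
> >   <property>
> >     <name>dfs.client.read.shortcircuit</name>
> >     <value>true</value>
> >   </property>
> >   <property>
> >     <name>dfs.domain.socket.path</name>
> >     <value>/var/lib/hadoop-hdfs/dn_socket</value>
> >   </property>
> >
> > If the domain socket is not set up correctly, clients typically fall
> > back to regular DataNode reads, which can make "SCR enabled" numbers
> > misleading.)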
> >
> > Meanwhile, I will try setting the parameter suggested by Lars and post
> > the results.
> >
> > Thanks,
> > Ramu
> >
> >
> > On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >
> >> Good call.
> >> You could try to enable hbase.regionserver.checksum.verify, which will
> >> cause HBase to do its own checksums rather than relying on HDFS (and
> >> which saves one IO per block get).
> >>
> >> I do think you can expect the index blocks to be cached at all times.
> >>
> >> -- Lars
> >> ________________________________
> >> From: Vladimir Rodionov <[EMAIL PROTECTED]>
> >> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> >> Sent: Tuesday, October 8, 2013 8:44 PM
> >> Subject: RE: HBase Random Read latency > 100ms
> >>
> >>
> >> Upd.
> >>
> >> Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file
> >> IOs (data + .crc) in the worst case. I think that if a Bloom filter is
> >> enabled, it becomes 6 file IOs in the worst case (large data set), so
> >> you will have not 5 IO requests in the queue but up to 20-30.
> >> This definitely explains the > 100 ms avg latency.
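> >>
> >> Spelled out, with 40 parallel requests spread across 8 region servers
> >> (5 Gets per RS, as in the sizing further down):
> >>
> >>   1 Get = 2 HDFS block reads (index + data)
> >>         = 4 file reads, since each block read touches data + .crc
> >>         = 6 file reads when a Bloom filter block also comes from disk
> >>
> >>   5 Gets x 4-6 file reads = 20-30 requests in each disk queue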
> >>
> >>
> >>
> >> Best regards,
> >> Vladimir Rodionov
> >> Principal Platform Engineer
> >> Carrier IQ, www.carrieriq.com
> >> e-mail: [EMAIL PROTECTED]
> >>
> >> ________________________________________
> >>
> >> From: Vladimir Rodionov
> >> Sent: Tuesday, October 08, 2013 7:24 PM
> >> To: [EMAIL PROTECTED]
> >> Subject: RE: HBase Random Read latency > 100ms
> >>
> >> Ramu,
> >>
> >> You have 8 server boxes and 10 clients. You have 40 requests in
> >> parallel - 5 per RS/DN?
> >>
> >> That is 5 random-read requests in the IO queue of your single RAID1.
> >> With an average read latency of 10 ms, 5 requests in the queue will
> >> give us about 30 ms. Add some overhead from HDFS + HBase, and your
> >> issue is probably explained?
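> >>
> >> (Back-of-envelope: in a queue of 5, a request has on average two others
> >> ahead of it, so ~2 x 10 ms of waiting plus its own 10 ms read gives the
> >> ~30 ms, before any HDFS/HBase overhead.)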
> >>
> >> Your bottleneck is your disk system, I think. When you serve most of