HBase, mail # user - HBase Random Read latency > 100ms

Re: HBase Random Read latency > 100ms
Ramu M S 2013-10-09, 07:11
Hi All,

Sorry, there was a mistake in the earlier tests (the clients were not reduced;
I forgot to change the parameter before running them).

With 8 Clients and,

SCR Enabled : Average Latency is 25 ms, IO Wait % is around 8
SCR Disabled: Average Latency is 10 ms, IO Wait % is around 2

Still, SCR disabled gives better results, which confuses me. Can anyone
clarify?

Also, I tried the parameter Lars suggested (hbase.regionserver.checksum.verify
set to true) with SCR disabled.
Average latency is around 9.8 ms, a fraction lower.
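
For reference, a minimal sketch (illustrative only, not the exact code used
here) of checking that the flag is picked up from the client classpath; the
property itself lives in hbase-site.xml on each region server and needs a
restart to take effect:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ChecksumFlagCheck {
        public static void main(String[] args) {
            // Loads hbase-default.xml and hbase-site.xml from the classpath.
            Configuration conf = HBaseConfiguration.create();
            // When true, HBase verifies its own checksums and skips the extra HDFS .crc read.
            boolean hbaseChecksums =
                conf.getBoolean("hbase.regionserver.checksum.verify", false);
            System.out.println("hbase.regionserver.checksum.verify = " + hbaseChecksums);
        }
    }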

Thanks
Ramu
On Wed, Oct 9, 2013 at 3:32 PM, Ramu M S <[EMAIL PROTECTED]> wrote:

> Hi All,
>
> I ran only 8 parallel clients,
>
> With SCR Enabled : Average Latency is 80 ms, IO Wait % is around 8
> With SCR Disabled: Average Latency is 40 ms, IO Wait % is around 2
>
> I always thought that enabling SCR allows a client co-located with the DataNode
> to read HDFS file blocks directly, giving a performance boost to distributed
> clients that are aware of locality.
>
> Is my understanding wrong, or does it simply not apply to my scenario?
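>
> (For reference, by "SCR enabled" I mean the usual client-side HDFS settings
> below - a minimal sketch assuming the Hadoop 2 style short-circuit read
> configuration; the socket path is only an example and must match the
> DataNode side:)
>
>     import org.apache.hadoop.conf.Configuration;
>
>     public class ScrClientSettings {
>         public static void main(String[] args) {
>             // Picks up hdfs-site.xml if it is on the classpath.
>             Configuration conf = new Configuration();
>             // Short-circuit read: a co-located client reads block files directly
>             // from the local DataNode over a shared domain socket.
>             conf.setBoolean("dfs.client.read.shortcircuit", true);
>             // Example path only - must match dfs.domain.socket.path on the DataNodes.
>             conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
>             System.out.println("SCR enabled = "
>                 + conf.getBoolean("dfs.client.read.shortcircuit", false));
>         }
>     }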
>
> Meanwhile, I will try setting the parameter suggested by Lars and post the
> results.
>
> Thanks,
> Ramu
>
>
> On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> Good call.
>> You could try enabling hbase.regionserver.checksum.verify, which causes
>> HBase to do its own checksums rather than relying on HDFS (and saves
>> 1 IO per block get).
>>
>> I do think you can expect the index blocks to be cached at all times.
>>
>> -- Lars
>> ________________________________
>> From: Vladimir Rodionov <[EMAIL PROTECTED]>
>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>> Sent: Tuesday, October 8, 2013 8:44 PM
>> Subject: RE: HBase Random Read latency > 100ms
>>
>>
>> Upd.
>>
>> Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file IOs
>> (data + .crc) in the worst case. I think if a Bloom filter is enabled, then
>> it is going to be 6 file IOs in the worst case (large data set), so you will
>> have not 5 IO requests in the queue but up to 20-30 IO requests in the
>> queue.
>> This definitely explains the > 100ms avg latency.
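>>
>> (A quick back-of-the-envelope restating that worst case - the numbers are
>> the assumptions above, not measurements:)
>>
>>     public class GetIoEstimate {
>>         public static void main(String[] args) {
>>             // Worst case per Get with HDFS-level checksums (.crc) and a Bloom filter:
>>             int hdfsBlockReads = 3;          // index block + bloom block + data block
>>             int filesPerBlockRead = 2;       // data file + .crc file
>>             int fileIOsPerGet = hdfsBlockReads * filesPerBlockRead;  // = 6
>>             int parallelGetsPerServer = 5;   // 40 client threads / 8 region servers
>>             int queuedIOs = fileIOsPerGet * parallelGetsPerServer;   // = 30
>>             double diskReadMs = 10.0;        // avg random read latency on the RAID1
>>             System.out.println(queuedIOs + " IOs queued, ~" + (queuedIOs * diskReadMs)
>>                 + " ms to drain the queue -> >100 ms per Get is plausible");
>>         }
>>     }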
>>
>>
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [EMAIL PROTECTED]
>>
>> ________________________________________
>>
>> From: Vladimir Rodionov
>> Sent: Tuesday, October 08, 2013 7:24 PM
>> To: [EMAIL PROTECTED]
>> Subject: RE: HBase Random Read latency > 100ms
>>
>> Ramu,
>>
>> You have 8 server boxes and 10 clients. You have 40 requests in parallel -
>> 5 per RS/DN?
>>
>> You have 5 random-read requests in the IO queue of your single RAID1.
>> With an average read latency of 10 ms, 5 requests in the queue give us about
>> 30 ms. Add some HDFS + HBase overhead and you will probably have your issue
>> explained?
>>
>> Your bottleneck is the disk system, I think. When you serve most requests
>> from disk, as in your large data set scenario, make sure you have an
>> adequate disk subsystem and that it is configured properly. The block cache
>> and OS page cache cannot help you in this case, as the working data set is
>> larger than both caches.
>>
>> The good performance numbers in the small data set scenario are explained
>> by the fact that the data fits into the OS page cache and the block cache -
>> you do not read data from disk even if you disable the block cache.
>>
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [EMAIL PROTECTED]
>>
>> ________________________________________
>> From: Ramu M S [[EMAIL PROTECTED]]
>> Sent: Tuesday, October 08, 2013 6:00 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: HBase Random Read latency > 100ms
>>
>> Hi All,
>>
>> After a few suggestions from the earlier mails I changed the following
>> (a rough sketch of how these map to configuration is included below):
>>
>> 1. Heap Size to 16 GB
>> 2. Block Size to 16KB
>> 3. HFile size to 8 GB (Table now has 256 regions, 32 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> I have clients running on 10 machines, each with 4 threads, so 40 in total.
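>>
>> (A rough sketch of how changes 1-3 above map to configuration - the column
>> family name is just an example, and the heap is set via HBASE_HEAPSIZE in
>> conf/hbase-env.sh rather than in code:)
>>
>>     import org.apache.hadoop.conf.Configuration;
>>     import org.apache.hadoop.hbase.HBaseConfiguration;
>>     import org.apache.hadoop.hbase.HColumnDescriptor;
>>
>>     public class TableTuningSketch {
>>         public static void main(String[] args) {
>>             Configuration conf = HBaseConfiguration.create();
>>             // 3. Max HFile/region size of 8 GB -> 256 regions for this table.
>>             conf.setLong("hbase.hregion.max.filesize", 8L * 1024 * 1024 * 1024);
>>
>>             // 2. The 16 KB HFile block size is a per-column-family setting.
>>             HColumnDescriptor cf = new HColumnDescriptor("cf");  // example family name
>>             cf.setBlocksize(16 * 1024);
>>
>>             // 1. The 16 GB heap is not set here; it comes from
>>             //    HBASE_HEAPSIZE=16384 in conf/hbase-env.sh on each region server.
>>             System.out.println("hbase.hregion.max.filesize = "
>>                 + conf.getLong("hbase.hregion.max.filesize", -1));
>>         }
>>     }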