HBase, mail # user - HBase Random Read latency > 100ms


Re: HBase Random Read latency > 100ms
Ramu M S 2013-10-08, 05:55
Hi All,

Average latency is still around 80 ms.
I have done the following:

1. Enabled Snappy compression
2. Reduced the HFile size to 8 GB

Should I attribute these results to a bad disk configuration, or is there
anything else I should investigate?
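
As a rough way to frame this question, here is a minimal sketch of the usual weighted-average latency model: average latency is the cache hit ratio times the hit latency, plus the miss ratio times the disk latency. Only the 80 ms average comes from this thread; the hit ratios, the 1 ms cache-hit latency, and the function names are hypothetical, chosen for illustration.

```python
def average_latency_ms(hit_ratio, cache_ms, disk_ms):
    """Expected read latency when a fraction hit_ratio of reads is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


def implied_miss_latency_ms(observed_ms, hit_ratio, cache_ms):
    """Solve the same weighted average for the latency of a cache miss."""
    if not 0.0 <= hit_ratio < 1.0:
        raise ValueError("hit_ratio must be at least 0 and below 1")
    return (observed_ms - hit_ratio * cache_ms) / (1.0 - hit_ratio)


if __name__ == "__main__":
    # Hypothetical inputs: only the 80 ms observed average is from the thread.
    for hit_ratio in (0.0, 0.5, 0.9):
        miss_ms = implied_miss_latency_ms(80.0, hit_ratio, cache_ms=1.0)
        print(f"hit ratio {hit_ratio:.0%}: implied miss latency {miss_ms:.1f} ms")
```

If the implied miss latency is far above what a single disk seek should cost, that would point toward queuing on the disks or the RAID layer rather than toward HBase settings.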

- Ramu
On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <[EMAIL PROTECTED]> wrote:

> Vladimir,
>
> Thanks for the insights into the future caching features. They look very
> interesting.
>
> - Ramu
>
>
> On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
> [EMAIL PROTECTED]> wrote:
>
>> Ramu,
>>
>> If your working set of data fits into 192GB, you may get an additional
>> boost by utilizing the OS page cache, or you can wait until the 0.98
>> release, which introduces a new bucket cache implementation (a port of the
>> Facebook L2 cache). You can try the vanilla bucket cache in 0.96 (not
>> released yet, but due soon). Both caches store data off-heap, but the
>> Facebook version can store encoded and compressed data and the vanilla
>> bucket cache does not.
>> There are some options for utilizing the available RAM efficiently (at
>> least in upcoming HBase releases). If your data set does not fit in RAM,
>> then your only hope is your 24 SAS drives, and the results will depend on
>> your RAID settings, disk IO performance, and HDFS configuration (I think
>> the latest Hadoop is preferable here).
>>
>> The OS page cache is the most vulnerable and volatile option: it cannot be
>> controlled and can easily be polluted, either by other processes or by
>> HBase itself (a long scan).
>> With the block cache you have more control, but the first truly usable
>> *official* implementation is going to be part of the 0.98 release.
>>
>> As far as I understand, your use case would definitely be covered by
>> something similar to the BigTable ScanCache (RowCache), but there is no
>> such cache in HBase yet.
>> One major advantage of RowCache over BlockCache (apart from being much more
>> efficient in RAM usage) is resilience to Region compactions. Each minor
>> Region compaction partially invalidates the Region's data in BlockCache,
>> and a major compaction invalidates this Region's data completely. This
>> would not be the case with RowCache (were it implemented).
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [EMAIL PROTECTED]
>>
>> ________________________________________
>> From: Ramu M S [[EMAIL PROTECTED]]
>> Sent: Monday, October 07, 2013 5:25 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: HBase Random Read latency > 100ms
>>
>> Vladimir,
>>
>> Yes, I am fully aware of the HDD limitations and the wrong RAID
>> configuration.
>> Unfortunately, the hardware is leased from others for this work, and I
>> wasn't consulted on the h/w specification for the tests that I am
>> doing now. Even the RAID cannot be turned off or set to RAID-0.
>> The production system is specified according to Hadoop's needs (100 nodes
>> with 16-core CPU, 192 GB RAM, 24 x 600GB SAS drives; RAID cannot be
>> completely turned off, so we are creating 1 Virtual Disk containing only
>> 1 Physical Disk, with the VD RAID level set to RAID-0). These systems are
>> still not available. If you have any suggestions on the production setup,
>> I will be glad to hear them.
>>
>> Also, as pointed out earlier, we are planning to use HBase as an
>> in-memory KV store as well, to access the latest data.
>> That's why such a large amount of RAM was specified in this configuration.
>> But it looks like we would run into more problems than gains from this.
>>
>> Keeping that aside, I am trying to get the maximum out of the current
>> cluster. Or, as you said, is 500-1000 OPS the max I could get out of this
>> setup?
>>
>> Regards,
>> Ramu
>>