HBase >> mail # user >> HBase Random Read latency > 100ms


Re: HBase Random Read latency > 100ms
Hi All,

The average latency is still around 80 ms.
So far I have done the following:

1. Enabled Snappy compression
2. Reduced the HFile size to 8 GB

Should I attribute these results to a bad disk configuration, or is there
anything else I should investigate?

- Ramu
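
One way to sanity-check an average like the 80 ms above is Little's law, which ties average latency to throughput and the number of requests in flight. The sketch below is only a back-of-envelope aid: the function names and the client concurrency levels are hypothetical, and only the 80 ms figure comes from this message.

```python
def throughput_ops(in_flight: float, avg_latency_s: float) -> float:
    """Little's law: throughput = requests in flight / average latency."""
    if avg_latency_s <= 0:
        raise ValueError("avg_latency_s must be positive")
    return in_flight / avg_latency_s


def in_flight_needed(target_ops: float, avg_latency_s: float) -> float:
    """Requests that must be in flight to sustain target_ops at this latency."""
    return target_ops * avg_latency_s


AVG_LATENCY_S = 0.080  # 80 ms, the average reported in this message

# Hypothetical client concurrency levels, for illustration only.
for threads in (8, 40, 80):
    ops = throughput_ops(threads, AVG_LATENCY_S)
    print(f"{threads} requests in flight -> about {ops:.0f} ops/s")
```

If the measured throughput is far below what the client concurrency would predict at this latency, the client side is worth a look as well as the disks.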
On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <[EMAIL PROTECTED]> wrote:

> Vladimir,
>
> Thanks for the insights into the future caching features. They look very
> interesting.
>
> - Ramu
>
>
> On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
> [EMAIL PROTECTED]> wrote:
>
>> Ramu,
>>
>> If your working set of data fits into 192 GB, you may get an additional
>> boost by utilizing the OS page cache, or you can wait for the 0.98 release,
>> which introduces a new bucket cache implementation (a port of the Facebook
>> L2 cache). You can also try the vanilla bucket cache in 0.96 (not released
>> yet, but due soon). Both caches store data off-heap, but the Facebook
>> version can store encoded and compressed data and the vanilla bucket cache
>> cannot. So there are several options for using the available RAM
>> efficiently (at least in upcoming HBase releases). If your data set does
>> not fit in RAM, then your only hope is your 24 SAS drives, and the result
>> will depend on your RAID settings, disk I/O performance, and HDFS
>> configuration (I think the latest Hadoop is preferable here).
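
The two cases in the paragraph above (working set fits in RAM, or reads fall through to the drives) can be roughed out numerically. This is a minimal sketch, not a sizing tool: the 192 GB and 24 drives come from the thread, while the reserved RAM, the per-node working set, and the per-drive IOPS figure are placeholder assumptions that should be replaced with measured values.

```python
from dataclasses import dataclass


@dataclass
class Node:
    ram_gb: float
    drives: int


def cacheable_gb(node: Node, reserved_gb: float) -> float:
    """RAM left for caching after a reserve (JVM heaps, OS, other daemons)."""
    return max(node.ram_gb - reserved_gb, 0.0)


def fits_in_cache(working_set_gb: float, node: Node, reserved_gb: float) -> bool:
    """True if the per-node working set fits into the RAM left for caching."""
    return working_set_gb <= cacheable_gb(node, reserved_gb)


def disk_read_ceiling(node: Node, iops_per_drive: float,
                      reads_per_get: float = 1.0) -> float:
    """Upper bound on uncached random gets/s if each get costs reads_per_get disk reads."""
    return node.drives * iops_per_drive / reads_per_get


node = Node(ram_gb=192, drives=24)   # figures quoted in the thread

RESERVED_GB = 32        # assumption: heap + OS + other daemons
WORKING_SET_GB = 120    # assumption: hot data per node
IOPS_PER_DRIVE = 150    # assumption: measure your own drives

print("fits in cache:", fits_in_cache(WORKING_SET_GB, node, RESERVED_GB))
print("disk ceiling (gets/s):", disk_read_ceiling(node, IOPS_PER_DRIVE))
```

The ceiling ignores RAID overhead and HDFS effects, so it is an optimistic bound rather than a prediction.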
>>
>> The OS page cache is the most vulnerable and volatile option: it cannot be
>> controlled and can easily be polluted, either by other processes or by
>> HBase itself (a long scan).
>> With the block cache you have more control, but the first truly usable
>> *official* implementation is going to be part of the 0.98 release.
>>
>> As far as I understand, your use case would definitely be covered by
>> something similar to the BigTable ScanCache (RowCache), but there is no
>> such cache in HBase yet.
>> One major advantage of a RowCache over the BlockCache (apart from being
>> much more efficient in RAM usage) is resilience to Region compactions.
>> Each minor Region compaction partially invalidates the Region's data in
>> the BlockCache, and a major compaction invalidates this Region's data
>> completely. This would not be the case with a RowCache (were it
>> implemented).
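
The compaction argument above can be illustrated with a toy model. This is not HBase code: `ToyRegion`, `read`, and both caches are hypothetical stand-ins, with the block cache keyed by (file, position) and the row cache keyed by row key alone. The row cache here is purely illustrative, since the message says HBase has no such cache.

```python
class ToyRegion:
    """Toy region: rows live in numbered store files; compaction rewrites them."""

    def __init__(self):
        self.files = {}        # file_id -> {row_key: value}
        self.next_file_id = 0

    def flush(self, rows):
        """Write rows to a brand-new file and return its id."""
        file_id = self.next_file_id
        self.next_file_id += 1
        self.files[file_id] = dict(rows)
        return file_id

    def compact(self, file_ids):
        """Merge the given files into one new file; the old files disappear."""
        merged = {}
        for file_id in sorted(file_ids):
            merged.update(self.files.pop(file_id))
        return self.flush(merged)

    def locate(self, row_key):
        """Return the id of the newest file holding row_key, or None."""
        for file_id in sorted(self.files, reverse=True):
            if row_key in self.files[file_id]:
                return file_id
        return None


def read(region, block_cache, row_cache, row_key):
    """Read one row; return (block_cache_hit, row_cache_hit) and warm both caches."""
    file_id = region.locate(row_key)
    if file_id is None:
        raise KeyError(row_key)
    block_key = (file_id, row_key)   # stand-in for (file, block offset)
    hits = (block_key in block_cache, row_key in row_cache)
    value = region.files[file_id][row_key]
    block_cache[block_key] = value
    row_cache[row_key] = value
    return hits


region = ToyRegion()
region.flush({"a": 1, "b": 2})   # file 0
region.flush({"c": 3})           # file 1
region.flush({"d": 4})           # file 2
block_cache, row_cache = {}, {}
for key in "abcd":
    read(region, block_cache, row_cache, key)     # warm both caches

region.compact([0, 1])                            # "minor": a subset of files
print({k: read(region, block_cache, row_cache, k) for k in "abcd"})

region.compact(list(region.files))                # "major": all files
print({k: read(region, block_cache, row_cache, k) for k in "abcd"})
```

In this toy, rows from the compacted files miss the file-keyed cache after each compaction, while the row-keyed cache keeps hitting. A real row cache would also have to handle writes and deletes, which the sketch leaves out.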
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [EMAIL PROTECTED]
>>
>> ________________________________________
>> From: Ramu M S [[EMAIL PROTECTED]]
>> Sent: Monday, October 07, 2013 5:25 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: HBase Random Read latency > 100ms
>>
>> Vladimir,
>>
>> Yes. I am fully aware of the HDD limitation and the wrong RAID
>> configuration.
>> Unfortunately, the hardware is leased from others for this work, and I
>> wasn't consulted on the h/w specification for the tests that I am
>> doing now. Even the RAID cannot be turned off or set to RAID-0.
>>
>> The production system is specified according to Hadoop's needs (100 nodes
>> with 16-core CPU, 192 GB RAM, 24 x 600 GB SAS drives; RAID cannot be
>> turned off completely, so we are creating Virtual Disks containing only
>> 1 Physical Disk each, with the VD RAID level set to RAID-0). These
>> systems are still not available. If you have any suggestions on the
>> production setup, I will be glad to hear them.
>>
>> Also, as pointed out earlier, we are planning to use HBase as an in-memory
>> KV store as well, to access the latest data.
>> That's why such a large amount of RAM was chosen for this configuration.
>> But it looks like we would run into more problems than gains from this.
>>
>> Setting that aside, I was trying to get the maximum out of the current
>> cluster. Or, as you said, is 500-1000 OPS the most I could get out of this
>> setup?
>>
>> Regards,
>> Ramu
>>