Re: HBase performance with HDFS
Thanks Andrew. Really helpful. I think I have one more question right
now :) Underneath, HDFS replicates each block three times by default. I'm
not sure how that relates to HFiles and compactions. When a compaction
occurs, does it also happen on the replica blocks on other nodes? If not,
how does it work when one node fails?

On Thu, Jul 7, 2011 at 1:53 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>> You mentioned compactions; when do those occur, and what triggers
>> them?
>
> Compactions are triggered by an algorithm that monitors the number and size of the flush files in a store; it is configurable in several dimensions.
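>
> As a rough sketch, the knobs look like this (a minimal example; the
> property names are from my memory of the defaults in this era, so verify
> them against your version's hbase-default.xml before relying on them):
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.hbase.HBaseConfiguration;
>
>     public class CompactionTuning {
>         public static void main(String[] args) {
>             Configuration conf = HBaseConfiguration.create();
>             // Compact a store once it accumulates this many flush files:
>             conf.setInt("hbase.hstore.compactionThreshold", 3);
>             // Block memstore flushes once a store has this many files:
>             conf.setInt("hbase.hstore.blockingStoreFiles", 7);
>             // Major-compact each region roughly once a day:
>             conf.setLong("hbase.hregion.majorcompaction", 24 * 60 * 60 * 1000L);
>         }
>     }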
>
>> Does it cause additional space usage when that happens?
>
> Yes.
>
>> If it does, it would mean you always need to have much more disk than
>> you really need.
>
>
> Not all regions are compacted at once. Each region is constrained to 256 MB by default, and not all regions will hold the full amount of data. The result is not a perfect copy (a doubling) if some data has been deleted or is associated with TTLs that have expired. The merge-sorted result is moved into place and the old files are deleted as soon as the compaction completes. So how much more is "much more"? You can't write to any kind of data store on a (nearly) full volume anyway, whether it's HBase/HDFS, or MySQL, or...
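>
> To put a number on the worst case for a single region (the figures here
> are illustrative assumptions, not measurements):
>
>     public class CompactionHeadroom {
>         public static void main(String[] args) {
>             // The merge-sorted result briefly coexists with the old
>             // store files, so one full region can need ~2x its size.
>             long regionCap = 256L * 1024 * 1024;   // default region size cap
>             long duringCompaction = 2 * regionCap; // old files + new result
>             System.out.println("peak bytes for one region: " + duringCompaction);
>             // Regions are not all compacted at once, so cluster-wide
>             // overhead stays far below 2x the total data size.
>         }
>     }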
>
>> Since HDFS is mostly write-once, how are updates/deletes handled?
>
>
> Not mostly, only write once.
>
> From the BigTable paper, section 5.3: "A valid read operation is executed on a merged view of the sequence of SSTables and the memtable. Since the SSTables and the memtable are lexicographically sorted data structures, the merged view can be formed efficiently." What this means is that all the store files and the memstore effectively serve as change logs sorted in reverse chronological order.
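>
> A toy model of that merged view (plain Java; the names are invented for
> the sketch and bear no relation to HBase internals):
>
>     import java.util.List;
>     import java.util.Map;
>     import java.util.TreeMap;
>
>     public class MergedView {
>         // Sources are ordered newest first: the memstore, then store
>         // files from newest to oldest. The newest source wins per key.
>         static TreeMap<String, String> merge(List<Map<String, String>> newestFirst) {
>             TreeMap<String, String> view = new TreeMap<String, String>();
>             for (Map<String, String> source : newestFirst) {
>                 for (Map.Entry<String, String> e : source.entrySet()) {
>                     if (!view.containsKey(e.getKey())) {
>                         view.put(e.getKey(), e.getValue()); // newer value wins
>                     }
>                 }
>             }
>             return view;
>         }
>     }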
>
> Deletes are just another write, but one that writes tombstones "covering" data with older timestamps.
>
> When serving queries, HBase searches store files back in time until it finds data at the coordinates requested or a tombstone.
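>
> In sketch form (again a toy model, not HBase's actual read path):
>
>     import java.util.List;
>     import java.util.Map;
>
>     public class ReadPath {
>         static final String TOMBSTONE = "\u0000DELETED"; // sentinel for the sketch
>
>         // Walk sources newest to oldest; the first hit for the key wins.
>         // A tombstone "covers" older values: report absence and stop.
>         static String read(List<Map<String, String>> newestFirst, String key) {
>             for (Map<String, String> source : newestFirst) {
>                 String v = source.get(key);
>                 if (v != null) {
>                     return TOMBSTONE.equals(v) ? null : v;
>                 }
>             }
>             return null; // key was never written
>         }
>     }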
>
> The process of compaction not only merge-sorts a bunch of accumulated store files (from flushes) into fewer store files (or one) for read efficiency, but also performs housekeeping, dropping data "covered" by delete tombstones. Incidentally, this is also how TTLs are supported: expired values are dropped as well.
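>
> Continuing the same toy model, a major compaction amounts to the same
> newest-first walk, but writing out only what survives (illustrative only):
>
>     import java.util.HashSet;
>     import java.util.List;
>     import java.util.Map;
>     import java.util.Set;
>     import java.util.TreeMap;
>
>     public class CompactionSketch {
>         static final String TOMBSTONE = "\u0000DELETED";
>
>         // Merge all sources into one sorted file, dropping tombstones,
>         // the older data they cover, and anything past its TTL.
>         static TreeMap<String, String> majorCompact(List<Map<String, String>> newestFirst,
>                 Map<String, Long> writeTimeMillis, long ttlMillis, long now) {
>             TreeMap<String, String> out = new TreeMap<String, String>();
>             Set<String> decided = new HashSet<String>();
>             for (Map<String, String> source : newestFirst) {
>                 for (Map.Entry<String, String> e : source.entrySet()) {
>                     if (!decided.add(e.getKey())) continue;       // covered by a newer entry
>                     if (TOMBSTONE.equals(e.getValue())) continue; // drop delete + covered data
>                     Long ts = writeTimeMillis.get(e.getKey());
>                     if (ts != null && now - ts > ttlMillis) continue; // TTL expired
>                     out.put(e.getKey(), e.getValue());
>                 }
>             }
>             return out;
>         }
>     }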
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
>
>>________________________________
>>From: Mohit Anchlia <[EMAIL PROTECTED]>
>>To: Andrew Purtell <[EMAIL PROTECTED]>
>>Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>>Sent: Thursday, July 7, 2011 12:30 PM
>>Subject: Re: HBase performance with HDFS
>>
>>Thanks, that helps! Just a few more questions:
>>
>>You mentioned compactions; when do those occur, and what triggers
>>them? Does it cause additional space usage when that happens? If it
>>does, it would mean you always need to have much more disk than you
>>really need.
>>
>>Since HDFS is mostly write-once, how are updates/deletes handled?
>>
>>Is HBase also suitable for blobs?
>>
>>On Thu, Jul 7, 2011 at 12:11 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>>> Some thoughts off the top of my head. Lars' architecture material
>>> might/should cover this too. Pretty sure his book will.
>>> Regarding reads:
>>> One does not have to read a whole HDFS block. You can request arbitrary byte
>>> ranges within the block via positioned reads. (It is also true that HDFS
>>> random-read performance can be improved in ways not necessarily yet committed
>>> to trunk, or especially to a 0.20.x branch with append support for HBase. See
>>> https://issues.apache.org/jira/browse/HDFS-1323)
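>>>
>>> A positioned read against HDFS looks roughly like this (a minimal
>>> sketch; the store file path is hypothetical):
>>>
>>>     import org.apache.hadoop.conf.Configuration;
>>>     import org.apache.hadoop.fs.FSDataInputStream;
>>>     import org.apache.hadoop.fs.FileSystem;
>>>     import org.apache.hadoop.fs.Path;
>>>
>>>     public class PositionedRead {
>>>         public static void main(String[] args) throws Exception {
>>>             FileSystem fs = FileSystem.get(new Configuration());
>>>             // Hypothetical store file path, for illustration only.
>>>             FSDataInputStream in = fs.open(new Path("/hbase/example/storefile"));
>>>             byte[] buf = new byte[64 * 1024];
>>>             // Read 64 KB at byte offset 1000000 without streaming the
>>>             // whole HDFS block: (position, buffer, offset, length).
>>>             in.readFully(1000000L, buf, 0, buf.length);
>>>             in.close();
>>>         }
>>>     }
>>>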
>>> HBase holds indexes to store files in HDFS in memory. We also open all store
>>> files at the HDFS layer and stash those references. Additionally, users can
>>> specify the use of bloom filters to improve query time performance through