HBase >> mail # user >> The write process in the Region Server


Re: The write process in the Region Server
Amit,

Your values for the HLog block size (hbase.regionserver.hlog.blocksize;
defaults to the HDFS default block size, 64 MB unless you've raised it,
which is too low unless you also have HLog compression) and the maximum
number of HLogs to keep (hbase.regionserver.maxlogs; default 32 files)
can easily cause premature flushing, since hitting the HLog limit is
another flush criterion. Given your write workload (which hits the WAL),
this is definitely what you're hitting.
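
As a rough illustration of the arithmetic, here is a minimal Python sketch (not HBase code). It compares how much WAL data fits under the HLog count limit with the memory the memstores are allowed to use. The roll multiplier of 0.95 (hbase.regionserver.logroll.multiplier), the global memstore fraction of 0.4 (hbase.regionserver.global.memstore.upperLimit) and the 8 GB heap are assumptions made for the example, not values from your cluster; the function names are hypothetical.

```python
MB = 1024 * 1024


def wal_capacity_bytes(hlog_block_size, max_logs, roll_multiplier=0.95):
    """Approximate WAL bytes retained before the HLog count limit is reached.

    roll_multiplier is an assumed default: the fraction of the block size
    at which a log file is rolled.
    """
    return int(hlog_block_size * roll_multiplier * max_logs)


def memstore_budget_bytes(heap_bytes, global_memstore_fraction=0.4):
    """Approximate heap available to all memstores (fraction is assumed)."""
    return int(heap_bytes * global_memstore_fraction)


def wal_forces_early_flush(hlog_block_size, max_logs, heap_bytes):
    """True if the HLog limit is likely reached before the memstores fill up."""
    return wal_capacity_bytes(hlog_block_size, max_logs) < memstore_budget_bytes(heap_bytes)


if __name__ == "__main__":
    # Defaults from the thread: 64 MB blocks, 32 logs; the 8 GB heap is hypothetical.
    capacity = wal_capacity_bytes(64 * MB, 32)
    budget = memstore_budget_bytes(8 * 1024 * MB)
    print("WAL capacity (MB):", capacity // MB)
    print("Memstore budget (MB):", budget // MB)
    print("Premature flushes likely:", wal_forces_early_flush(64 * MB, 32, 8 * 1024 * MB))
```

If the WAL capacity comes out smaller than the memstore budget, regions get flushed to free old HLogs before their memstores reach the configured flush size, which would match the small flushes you see.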

On Sat, Jun 16, 2012 at 7:47 PM, Amit Sela <[EMAIL PROTECTED]> wrote:
> Thanks Doug, I read the regions section from the book like you recommended
> but I still have some questions left.
>
> When running a massive write job, the regionserver log shows the memsize
> that is flushed. The problem is that most of the time the memsize is either
> much smaller than the configured memstore.flush.size (resulting in writing
> more files, which leads to frequent compactions) or bigger
> than memstore.flush.size * memstore.block.multiplier (resulting in Blocking
> updates for 'IPC Server handler # on <port>...).
> In some cases I also see HBaseServer throwing a ClosedChannelException:
> "WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler <handler #> on
> <port #> caught: java.nio.channels.ClosedChannelException"
>
> I guess these problems are also the cause of long pauses (a few minutes)
> and, in extreme cases, Full GC during the write jobs.
>
> Any ideas anyone ?
>
> In general, I did some digging and couldn't find much about the write
> process in HBase from a "memory usage" point of view... besides the
> configurations description - maybe worth adding to the book.
>
> Thank you for all your help,
>
> Amit.
>
>
> On Mon, Jun 11, 2012 at 3:22 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
>
>>
>> Hi there-
>>
>> Your understanding is on track.
>>
>>
>> You probably want to read this section..
>>
>> http://hbase.apache.org/book.html#regions.arch
>>
>> ... as it covers those topics in more detail.
>>
>>
>>
>>
>> On 6/10/12 1:02 PM, "Amit Sela" <[EMAIL PROTECTED]> wrote:
>>
>> >Hi all,
>> >
>> >I'm trying to better understand what's going on in the region server
>> >during a write to HBase.
>> >
>> >As I understand the process:
>> >
>> >1. Data is written to memstore.
>> >2. Once the memstore has reached hbase.hregion.memstore.flush.size ->
>> >memstore executes flush and writes a new StoreFile.
>> >3. The number of StoreFiles increases until a compaction is triggered.
>> >
>> >To my understanding, the compaction is triggered after a compaction check
>> >is done by either the CheckCompaction thread running in the background or
>> >by the memstore flush being executed.
>> >The compaction triggered will be a minor compaction BUT it could be promoted
>> >to major if it includes all store files.
>> >When will it NOT include all store files? Say I set compactionThreshold to
>> >3, then when the 3rd (or 4th) flush is executed, a compaction will be
>> >triggered and will be promoted to major since it includes all store files.
>> >
>> >Is this right ? can anyone elaborate ?
>>
>>
>>

--
Harsh J