Re: The write process in the Region Server
Hi,

No, in that case my comment can be considered incorrect. The HLog
shouldn't fill up very fast - and your problem does sound memory bound
now (upper/lower watermark hits).
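
For reference, those watermarks are the region server's global memstore limits. A
minimal sketch of the 0.94-era settings, set programmatically here purely for
illustration (they normally live in hbase-site.xml):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class GlobalMemstoreLimits {
        public static void main(String[] args) {
            // Sketch only, using the 0.94-era defaults: when the combined size of all
            // memstores crosses the upper limit, the region server blocks updates and
            // flushes until total memstore usage drops below the lower limit.
            Configuration conf = HBaseConfiguration.create();
            conf.setFloat("hbase.regionserver.global.memstore.upperLimit", 0.4f);  // 40% of heap
            conf.setFloat("hbase.regionserver.global.memstore.lowerLimit", 0.35f); // 35% of heap
            System.out.println("upper=" + conf.get("hbase.regionserver.global.memstore.upperLimit"));
        }
    }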

On Sun, Jun 17, 2012 at 11:49 AM, Infolinks <[EMAIL PROTECTED]> wrote:
> Hi Harsh J,
>
> I'm not using WAL in my writes.
> Is there still log rolling?
>
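
(Skipping the WAL per write in the 0.92/0.94-era client API looks roughly like the
sketch below; the table name "mytable" and column family "cf" are made up for
illustration.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NoWalPut {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "mytable");      // hypothetical table name
            Put put = new Put(Bytes.toBytes("row1"));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
            // Skip the WAL: faster writes, but edits still in the memstore are lost
            // if the region server crashes before the next flush.
            put.setWriteToWAL(false);
            table.put(put);
            table.close();
        }
    }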
> On Jun 17, 2012, at 7:40, Harsh J <[EMAIL PROTECTED]> wrote:
>
>> Amit,
>>
>> Your HLog block size (hbase.regionserver.hlog.blocksize, which defaults to
>> the HDFS default block size - 64 MB unless you've raised it - and is too low
>> unless you also enable HLog compression) multiplied by the maximum number of
>> HLogs to keep (hbase.regionserver.maxlogs, default 32 files) can easily cause
>> premature flushing, as that product is another flush criterion.
>> Given your write workload (which hit the WAL), this is definitely what
>> you're hitting.
>>
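
A rough back-of-the-envelope check of that WAL "budget" (a sketch using the era's
defaults; the actual roll point also involves hbase.regionserver.logroll.multiplier,
typically 0.95):

    public class WalFlushBudget {
        public static void main(String[] args) {
            long hlogBlockSize = 64L * 1024 * 1024; // hbase.regionserver.hlog.blocksize (HDFS default)
            double rollMultiplier = 0.95;           // hbase.regionserver.logroll.multiplier
            int maxLogs = 32;                       // hbase.regionserver.maxlogs
            // Approximate amount of WAL data a region server can accumulate before it
            // starts forcing memstore flushes so old HLogs can be archived.
            double budgetBytes = hlogBlockSize * rollMultiplier * maxLogs;
            System.out.printf("~%.1f GB of WAL edits before forced flushes%n",
                    budgetBytes / (1024.0 * 1024 * 1024));
        }
    }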
>> On Sat, Jun 16, 2012 at 7:47 PM, Amit Sela <[EMAIL PROTECTED]> wrote:
>>> Thanks Doug, I read the regions section of the book as you recommended,
>>> but I still have some questions left.
>>>
>>> When running a massive write job, the regionserver log shows the memsize
>>> that is flushed. The problem is that most of the time the memsize is either
>>> much smaller than the configured memstore.flush.size (resulting in more,
>>> smaller files being written, which leads to frequent compactions) or bigger
>>> than memstore.flush.size * memstore.block.multiplier (resulting in "Blocking
>>> updates for 'IPC Server handler # on <port>..." messages).
>>> In some cases I also see HBaseServer throwing a ClosedChannelException:
>>> "WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler <handler #> on
>>> <port #> caught: java.nio.channels.ClosedChannelException"
>>>
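
For concreteness, the two per-region thresholds described above combine roughly like
this (a sketch with illustrative values, not the actual cluster settings):

    public class MemstoreThresholds {
        public static void main(String[] args) {
            long flushSize = 128L * 1024 * 1024; // hbase.hregion.memstore.flush.size (example value)
            int blockMultiplier = 2;             // hbase.hregion.memstore.block.multiplier
            // Flushes triggered below flushSize (e.g. by WAL pressure or the global heap
            // watermarks) produce many small StoreFiles and hence more compactions.
            // Once a memstore grows past flushSize * blockMultiplier, the region blocks
            // updates, which shows up as the "Blocking updates for 'IPC Server handler..."
            // log message.
            long blockingThreshold = flushSize * blockMultiplier;
            System.out.println("flush at ~" + flushSize + " bytes, block updates at ~"
                    + blockingThreshold + " bytes");
        }
    }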
>>> I guess these problems are also the cause of the long pauses (a few minutes)
>>> and, in extreme cases, Full GC during the write jobs.
>>>
>>> Any ideas anyone ?
>>>
>>> In general, I did some digging and couldn't find much about the write
>>> process in HBase from a "memory usage" point of view... besides the
>>> configuration descriptions - maybe worth adding to the book.
>>>
>>> Thank you for all your help,
>>>
>>> Amit.
>>>
>>>
>>> On Mon, Jun 11, 2012 at 3:22 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
>>>
>>>>
>>>> Hi there-
>>>>
>>>> Your understanding is on track.
>>>>
>>>>
>>>> You probably want to read this section..
>>>>
>>>> http://hbase.apache.org/book.html#regions.arch
>>>>
>>>> ... as it covers those topics in more detail.
>>>>
>>>>
>>>>
>>>>
>>>> On 6/10/12 1:02 PM, "Amit Sela" <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm trying to better understand what's going on in the region server
>>>>> during writes to HBase.
>>>>>
>>>>> As I understand the process:
>>>>>
>>>>> 1. Data is written to memstore.
>>>>> 2. Once the memstore reaches hbase.hregion.memstore.flush.size, the
>>>>> memstore is flushed and a new StoreFile is written.
>>>>> 3. The number of StoreFiles increases until a compaction is triggered.
>>>>>
>>>>> To my understanding, a compaction is triggered after a compaction check
>>>>> is done, either by the CheckCompaction thread running in the background or
>>>>> by the memstore flush itself.
>>>>> The triggered compaction will be a minor compaction, BUT it could be
>>>>> promoted to major if it includes all store files.
>>>>> When will it NOT include all store files? Say I set compactionThreshold to
>>>>> 3; then when the 3rd (or 4th) flush is executed, a compaction will be
>>>>> triggered and promoted to major since it includes all store files.
>>>>>
>>>>> Is this right? Can anyone elaborate?
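
A minimal sketch of that decision, as I understand it (illustrative logic only, not
the actual store code): a compaction request that happens to select every StoreFile
in the store is promoted from minor to major.

    import java.util.Arrays;
    import java.util.List;

    public class CompactionSketch {
        // Illustrative only: a compaction is requested once the number of StoreFiles
        // reaches hbase.hstore.compactionThreshold (checked after a flush or by the
        // background compaction-check chore).
        static boolean shouldRequestCompaction(int storeFileCount, int compactionThreshold) {
            return storeFileCount >= compactionThreshold;
        }

        // Illustrative only: a minor compaction that ends up selecting all of the
        // store's files is promoted to a major compaction.
        static boolean promotedToMajor(List<String> selected, List<String> allStoreFiles) {
            return selected.size() == allStoreFiles.size();
        }

        public static void main(String[] args) {
            List<String> all = Arrays.asList("sf1", "sf2", "sf3");
            System.out.println(shouldRequestCompaction(all.size(), 3)); // true -> compact
            System.out.println(promotedToMajor(all, all));              // true -> major
        }
    }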
>>>>
>>>>
>>>>
>>
>>
>>
>> --
>> Harsh J

--
Harsh J