HBase >> mail # user >> The write process in the Region Server


Re: The write process in the Region Server
Hi all,

Thanks for all the help, I think I got it.
In addition to everyone's advice I also found a useful post regarding
stability and performance:
http://kisalay.com/2012/04/09/hbase-configurations/

That led me to configure a smaller memstore flush size of 128MB with a block
multiplier of 4, but with more StoreFiles (20) and about 30 handlers.
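
To make those two numbers concrete, here is a minimal sketch of the per-region thresholds they imply. It assumes, as described further down the thread, that a flush is requested at hbase.hregion.memstore.flush.size and that updates are blocked at flush size times the block multiplier. The function name is hypothetical and is not part of HBase.

```python
# Sketch only: derives the two per-region thresholds implied by the settings
# above. memstore_thresholds is a hypothetical helper, not an HBase API.
MB = 1024 * 1024

def memstore_thresholds(flush_size_bytes, block_multiplier):
    """Return (flush_at, block_at) in bytes for a single region's memstore."""
    flush_at = flush_size_bytes                      # a flush is requested here
    block_at = flush_size_bytes * block_multiplier   # updates are blocked here
    return flush_at, block_at

flush_at, block_at = memstore_thresholds(128 * MB, 4)
print(f"flush requested at {flush_at // MB} MB, updates blocked at {block_at // MB} MB")
# prints: flush requested at 128 MB, updates blocked at 512 MB
```

So with these settings a single region can buffer up to roughly 512MB before writers to it are held back.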

My write job now runs smoothly (no long pauses) and the heap is managed well.
The only thing I still don't get is the small flushes (less than 100MB and
sometimes less than 10MB) that I sometimes see.
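
One way to reason about the small flushes is to list the flush criteria that come up in this thread: the per-region flush size, the global memstore watermark, and the number of HLog files. The sketch below is a rough model under stated assumptions, not HBase's actual logic. The function and its parameters are hypothetical, the 0.4 upper limit and the heap size are illustrative values, and the exact comparisons HBase uses may differ.

```python
# Rough model of the flush criteria mentioned in this thread.
# flush_triggers and all parameter names are hypothetical, and the default
# values are assumptions for illustration, not values read from a cluster.
def flush_triggers(region_mb, total_memstore_mb, heap_mb, wal_files,
                   flush_size_mb=128, upper_limit=0.4, max_logs=32):
    """List which of the three criteria would request a flush right now."""
    reasons = []
    if region_mb >= flush_size_mb:
        reasons.append("region memstore reached flush size")
    if total_memstore_mb >= upper_limit * heap_mb:
        reasons.append("global memstore upper watermark hit")
    if wal_files > max_logs:
        reasons.append("too many HLog files")
    return reasons

# A region holding only 10MB while all memstores together use 1700MB of a 4096MB heap:
print(flush_triggers(region_mb=10, total_memstore_mb=1700, heap_mb=4096, wal_files=0))
# prints: ['global memstore upper watermark hit']
```

In this model a flush can be requested while an individual region is far below its flush size, which would be consistent with the memory-bound explanation quoted below.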

Hope this digging helps someone in the future ;)

Thanks,

Amit.

On Sun, Jun 17, 2012 at 10:03 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi,
>
> No, in that case my comment can be considered incorrect. The HLog
> shouldn't fill up very fast - and your problem does sound memory bound
> now (upper/lower watermark hits).
>
> On Sun, Jun 17, 2012 at 11:49 AM, Infolinks <[EMAIL PROTECTED]> wrote:
> > Hi Harsh J,
> >
> > I'm not using WAL in my writes.
> > Is there still log rolling?
> >
> > On Jun 17, 2012, at 7:40, Harsh J <[EMAIL PROTECTED]> wrote:
> >
> >> Amit,
> >>
> >> Your values for the HLog block size (hbase.regionserver.hlog.blocksize;
> >> the default is the HDFS default block size, 64 MB unless you've properly
> >> raised it, which is too low unless you also have HLog compression) and for
> >> the maximum number of HLogs to keep (hbase.regionserver.maxlogs, default 32
> >> files) can easily cause premature flushing, as this is another flush
> >> criterion. Given your write workload (which hits the WAL), this is
> >> definitely what you're hitting.
> >>
> >> On Sat, Jun 16, 2012 at 7:47 PM, Amit Sela <[EMAIL PROTECTED]> wrote:
> >>> Thanks Doug, I read the regions section from the book like you recommended
> >>> but I still have some questions left.
> >>>
> >>> When running a massive write job, the regionserver log shows the memsize
> >>> that is flushed. The problem is that most of the time the memsize is either
> >>> much smaller than the configured memstore.flush.size (resulting in writing
> >>> more files, which leads to frequent compactions) or bigger
> >>> than memstore.flush.size * memstore.block.multiplier (resulting in Blocking
> >>> updates for 'IPC Server handler # on <port>'...).
> >>> In some cases I also see HBaseServer throwing a ClosedChannelException:
> >>> "WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler <handler #> on
> >>> <port #> caught: java.nio.channels.ClosedChannelException"
> >>>
> >>> I guess these problems are also the cause of the long (few minutes) pauses
> >>> and, in extreme cases, Full GC during the write jobs.
> >>>
> >>> Any ideas, anyone?
> >>>
> >>> In general, I did some digging and couldn't find much about the write
> >>> process in HBase from a "memory usage" point of view... besides the
> >>> configurations description - maybe worth adding to the book.
> >>>
> >>> Thank you for all your help,
> >>>
> >>> Amit.
> >>>
> >>>
> >>> On Mon, Jun 11, 2012 at 3:22 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
> >>>
> >>>>
> >>>> Hi there-
> >>>>
> >>>> Your understanding is on track.
> >>>>
> >>>>
> >>>> You probably want to read this section...
> >>>>
> >>>> http://hbase.apache.org/book.html#regions.arch
> >>>>
> >>>> ... as it covers those topics in more detail.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 6/10/12 1:02 PM, "Amit Sela" <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm trying to better understand what's going on in the region server
> >>>>> during a write to HBase.
> >>>>>
> >>>>> As I understand the process:
> >>>>>
> >>>>> 1. Data is written to memstore.
> >>>>> 2. Once the memstore reaches hbase.hregion.memstore.flush.size, it is
> >>>>> flushed and a new StoreFile is written.
> >>>>> 3. The number of StoreFiles increases until a compaction is triggered.
> >>>>>
> >>>>> To my understanding, the compaction is triggered after a compaction check
> >>>>> is done by either CheckCompaction thread running in the background