HBase user mailing list — Help with continuous loading configuration


Re: Help with continuous loading configuration
You can call put.setWriteToWAL(false) to skip the write-ahead log, which
otherwise slows down puts significantly.  But you will lose data if a
regionserver crashes with data still in its memstore.
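As a sketch of the call described above (this assumes the 0.90-era HBase client API, where Put#setWriteToWAL(boolean) exists; the table, column family, and row key names here are made up for illustration, and a running cluster is required):

```java
// Sketch only: assumes the 0.90-era HBase client API and a reachable cluster.
// "mytable", "cf", and "q" are hypothetical names, not from the thread.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "mytable");

Put put = new Put(Bytes.toBytes("row-0001"));
put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
put.setWriteToWAL(false);  // skip the WAL: faster puts, but memstore data is lost on a crash
table.put(put);
table.close();
```

The trade-off is exactly the one stated above: with the WAL disabled, a regionserver crash silently loses whatever had not yet been flushed from its memstore.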
On Wed, Nov 16, 2011 at 4:09 PM, Amit Jain <[EMAIL PROTECTED]> wrote:

> Hi Stack,
>
> Thanks for the feedback.  Comments inline ...
>
> On Wed, Nov 16, 2011 at 3:35 PM, Stack <[EMAIL PROTECTED]> wrote:
>
> > On Wed, Nov 16, 2011 at 3:26 PM, Amit Jain <[EMAIL PROTECTED]> wrote:
> > > Hi Lars,
> > >
> > > The keys are arriving in random order.  The HBase monitoring page shows
> > > evenly distributed load across all of the region servers.
> >
> > What kind of ops rates are you seeing?  Are they running nice and
> > smooth across all servers?  No stuttering?  What do your regionserver
> > logs look like?
> >
> > Are you presplitting your table, or just letting HBase run and do the
> > splits on its own?
> >
>
> As far as I can tell, the operations look smooth across all servers.  We're
> not doing any pre-splitting, just letting HBase do the splits.
>
>
> > >  I didn't see
> > > anything weird in the gc logs, no mention of any failures.  I'm a
> little
> > > unclear about what the optimal values for the following properties
> should
> > > be:
> > >
> > > hbase.hstore.compactionThreshold
> >
> > Default is 3.  Look in the regionserver logs.  See how many files you
> > have on average per region column family (you could also look in the
> > filesystem).  Are we constantly rewriting them?  If the load is mostly
> > write-only, you might raise this to put off compactions until more
> > files are around (though, looking at the regionserver logs, if the
> > write rate is high we may be having trouble keeping up with the
> > default threshold anyway).
> >
>
> Well, it looks like half of the regions are in the 25-32 file range and the
> other half just have 1 or 2 files.  This was when we ran it with a
> compactionThreshold of 15.
>
> How can I tell by looking at the region server logs if we're seeing a "high
> write rate" ?  We've got 48 clients sending load, 12 region servers total.
>  We're pushing the system pretty hard.
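For reference, the threshold discussed above is set in hbase-site.xml. A sketch of the setting as the poster describes it (the value 15 mirrors the experiment in the thread, not a recommendation; the default in this era is 3):

```xml
<!-- hbase-site.xml fragment: raise the minor-compaction trigger.
     Value 15 matches the experiment described above, not a recommendation. -->
<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>15</value>  <!-- default 3; higher defers compactions under write-heavy load -->
</property>
```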
>
>
> > > hbase.hstore.blockingStoreFiles
> > >
> >
> > The higher this is, the bigger the price you'll pay if a server
> > crashes, because it will be the upper bound on how many WAL logs we
> > need to split before that server's regions come back online.  I'd
> > say leave it at the default for now.
> >
>
> Ok, we'll leave it default for now.
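For completeness, this property also lives in hbase-site.xml. A sketch showing it left at its default, as agreed above (7 was the default in this era; verify against your version's hbase-default.xml):

```xml
<!-- hbase-site.xml fragment: store-file count at which updates to a region block.
     7 is believed to be the era's default; leaving it unset has the same effect. -->
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>7</value>
</property>
```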
>
>
> > > Is there some rule of thumb that I can use to determine good values for
> > > these properties?
> > >
> >
> > You've checked out this section of the book:
> > http://hbase.apache.org/book.html#performance
> >
> > Are you filling the machines?  Are they burning CPU, or are they
> > IO-bound?  If not, perhaps open the front gate wider by upping the
> > number of concurrent handlers.
> >
>
> I have read through that section of the HBase book.  There is plenty of CPU
> available.  How do I up the number of concurrent handlers?  Increase
> hbase.regionserver.handler.count ?
>
> - Amit
>
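Yes — upping the handler count means raising hbase.regionserver.handler.count in hbase-site.xml. A sketch (10 is the default in this era; 30 here is an illustrative value, not a tuned recommendation):

```xml
<!-- hbase-site.xml fragment: number of RPC handler threads per regionserver.
     Default is 10; 30 is an illustrative bump, to be sized against client load. -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>30</value>
</property>
```

More handlers admit more concurrent client requests per regionserver, which helps when CPUs are idle; it also lets more in-flight puts accumulate in memory, so raise it gradually while watching memory and GC behavior.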