Sorry for the novel. I am also not fully awake, so hopefully all of this makes sense.
Something like that (peak/off-peak settings) does not exist, but a
properly sized and tuned system should not experience this behavior. The
blog you linked (great blog, btw) covers a fair portion of it. You have to
look at it as a whole. If you have a peak that you cannot sustain, you are
probably undersized: you need a cluster that can handle peak
+10% (an arbitrary number, but it makes the point). I would be curious to take a
look at your logs from the peak period when you are seeing blocking.
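
To put a number on "peak +10%", here is a rough sketch in Python. The function name, the writes-per-second figures, and the idea of a single per-server throughput number are all assumptions for illustration; real capacity depends on your hardware and workload.

```python
def required_region_servers(peak_writes_per_sec, per_server_writes_per_sec, headroom_pct=10):
    """Smallest server count that sustains peak load plus a headroom percentage.

    All inputs are hypothetical integers; integer math avoids float rounding.
    """
    if per_server_writes_per_sec <= 0:
        raise ValueError("per-server capacity must be positive")
    target_scaled = peak_writes_per_sec * (100 + headroom_pct)
    capacity_scaled = per_server_writes_per_sec * 100
    # Ceiling division: round up, since a fraction of a server is not an option.
    return -(-target_scaled // capacity_scaled)


# Example: a 100,000 writes/s peak on servers that each sustain 10,000 writes/s.
print(required_region_servers(100_000, 10_000))  # 11
```

With no headroom the same peak would fit on 10 servers, which is exactly the "cannot sustain the peak" situation.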
Just raising the memstore flush size won't help you if you are already
flushing at too small a size. For example:
memstore flush size = 128MB
total heap = 10GB
memstore upper limit = 0.4
total heap devoted to memstore = 4GB
active regions per region server = 100
4096MB / 100 regions = ~41MB MAX per region
In the above case, if you raise your memstore flush size to 256MB,
nothing is gained, since the bottleneck was not the flush size. The bottleneck
was heap based, so we need to either raise the heap, allocate more to the
upper/lower limit, or lower the region count.
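
The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model, assuming every active region takes writes evenly; the function and parameter names are made up for illustration and are not HBase APIs.

```python
def effective_flush_size_mb(heap_mb, memstore_upper_limit, active_regions, flush_size_mb):
    """Largest memstore a region can reach before something forces a flush.

    Assumes writes are spread evenly over all active regions (a simplification).
    """
    global_memstore_mb = heap_mb * memstore_upper_limit   # heap share for all memstores
    per_region_mb = global_memstore_mb / active_regions   # even split across regions
    # A region flushes at whichever limit it hits first.
    return min(flush_size_mb, per_region_mb)


# The example above: 10GB heap, 0.4 upper limit, 100 active regions.
for flush_size in (128, 256):
    print(flush_size, round(effective_flush_size_mb(10 * 1024, 0.4, 100, flush_size), 2))
```

Both flush sizes come out at roughly 41MB per region, so doubling the flush size changes nothing. Dropping to 20 active regions in the same model gives about 205MB per region, at which point the configured flush size becomes the limit again.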
Another aspect I look at is HLog count/size. You want to size the
total number of HLogs * the size of each HLog to be equal to your memstore
flush size so that they roll right around the same time. If you don't, you
will have big memstore flush sizes (256/512MB), but your HLogs will roll
and cause premature small flushes. That means more flushes,
hence more compactions, which can lead to blocking.
Raising the blocking number is typically a last resort for me. I do
think 7 is too low and I usually set systems to 15. If you
just raise it to 100 or even 1000, you are just masking the issue. Also,
if the system falls too far behind, it may never be able
to catch up.
There is also a chance that you are trying to do too much with too
little. Like I said before, always size your system for your peak loads.
On Sun, Jun 9, 2013 at 10:17 PM, yun peng <[EMAIL PROTECTED]> wrote:
> Thanks Lars for the insights. I guess current HBase may have to block the write
> stream even when the write rate does not reach the limit of the IO subsystems.
> Blocking happens because compaction is so resource-consuming and has to
> be invoked synchronously (say, to keep #hfile < K), so invoking
> compaction could block the write stream? (Correct me if I am wrong.)
> On Sun, Jun 9, 2013 at 7:33 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> > One thing to keep in mind is that this typically happens when you write
> > faster than your IO subsystems can support.
> > For a while HBase will absorb this by buffering in the memstore, but if
> > you sustain the write load something will have to slow down the writers.
> > Granted, this could be done a bit more gracefully.
> > -- Lars
> > ________________________________
> > From: yun peng <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Sent: Sunday, June 9, 2013 6:28 AM
> > Subject: Hbase write stream blocking and any solutions?
> > Hi, All
> > HBase could block the online write operations when there are too many
> > in memstore (to be more efficient for the potential compaction incurred
> > this flush when there're many files on disk). This blocking effect is
> > observed by others (e.g.,
> > http://gbif.blogspot.com/2012/07/optimizing-writes-in-hbase.html).
> > The solution proposed in the blog above is to increase the
> > size with fewer # of flushes, and to tolerate bigger # of files on disk
> > increasing blockingStoreFiles). This is a kind of HBase tuning towards
> > write intensive workload.
> > My target application has a dynamic workload which may change from
> > write-intensive to read-intensive. Also there are peak hours (when
> > is user perceivable and should not be invoked) and offpeak hours (when
Systems Engineer, Cloudera