On Sat, Nov 10, 2012 at 7:03 AM, yun peng <[EMAIL PROTECTED]> wrote:
> Hi, I want to profile the number of disk accesses (both random and
> sequential) issued from HBase (into HDFS). For disk reads, I have tried
> using blockCacheMissCount, which seems to work. But is it the correct way
> to measure reads (I couldn't confirm it from the HBase documentation)?
That should give you a coarse measure. If you need better than that,
you may need to instrument the code to emit more detailed metrics on
whether each access is random or sequential. Beware that hdfs may
serve some reads from the file system cache, avoiding disk altogether.
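
If you do instrument the read path, one rough way to split random from
sequential is to compare each read's offset with where the previous read
of the same file ended. Below is a minimal sketch of that bookkeeping only.
It is not HBase or HDFS code: ReadTracker, record() and the file names are
made up for illustration, and you would have to call something like it
from wherever you hook the reads.

```python
class ReadTracker:
    """Hypothetical helper: classifies reads per file by offset."""

    def __init__(self):
        self.sequential = 0
        self.random = 0
        # file name -> offset immediately after the last recorded read
        self._next_offset = {}

    def record(self, file_name, offset, length):
        # A read counts as sequential only if it starts exactly where the
        # previous read of the same file ended; anything else (including
        # the first read of a file) counts as random.
        if self._next_offset.get(file_name) == offset:
            self.sequential += 1
        else:
            self.random += 1
        self._next_offset[file_name] = offset + length


tracker = ReadTracker()
tracker.record("hfile-a", 0, 65536)        # first read of the file: random
tracker.record("hfile-a", 65536, 65536)    # continues the previous read: sequential
tracker.record("hfile-a", 1048576, 65536)  # jumps ahead: random
print(tracker.sequential, tracker.random)
```

Note this classifies requests as HBase issues them. It still cannot tell
you whether a given read was answered from the file system cache.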
> For disk writes, I can't find any metrics in HBase. How should one
> measure disk writes in HBase?
We write when we append to the WAL and when we flush hfiles. If you
need actual disk accesses, you'll probably need to add some extra
emissions in the code: one per WAL edit and one per flush. Note that
an hdfs flush/sync usually means a flush from hbase to hdfs, more
specifically to datanode memory and not necessarily to disk (at least
currently).
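
The extra emissions could be as simple as a pair of counters bumped at the
two write points. A minimal sketch, assuming you add the hooks yourself:
WriteCounters, on_wal_append() and on_flush() are hypothetical names, not
existing HBase metrics, and the sizes below are made-up sample values.

```python
class WriteCounters:
    """Hypothetical counters for the two HBase write points."""

    def __init__(self):
        self.wal_appends = 0
        self.wal_bytes = 0
        self.flushes = 0
        self.flush_bytes = 0

    def on_wal_append(self, edit_size):
        # Call once per WAL edit appended.
        self.wal_appends += 1
        self.wal_bytes += edit_size

    def on_flush(self, hfile_size):
        # Call once per memstore flush that writes an hfile.
        self.flushes += 1
        self.flush_bytes += hfile_size


counters = WriteCounters()
counters.on_wal_append(512)
counters.on_wal_append(2048)
counters.on_flush(64 * 1024 * 1024)
print(counters.wal_appends, counters.wal_bytes,
      counters.flushes, counters.flush_bytes)
```

These count writes handed to hdfs, so they are an upper-level proxy. They
say nothing about when (or whether) the datanode puts the bytes on disk.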