Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # user >> flushing + compactions after config change


Copy link to this message
-
flushing + compactions after config change
Hi All,

I wanted some help on understanding what's going on with my current setup.
I updated from config to the following settings:

  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>107374182400</value>
  </property>

  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value>
  </property>

  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>134217728</value>
  </property>

  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>50</value>
  </property>

  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value>
  </property>

Prior to this, all the settings were default values. I wanted to increase
the write throughput on my system and also control when major compactions
happen. In addition to that, I wanted to make sure that my regions don't
split quickly.

After the change in settings, I am seeing a huge storm of memstore flushing
and minor compactions some of which get promoted to major compaction. The
compaction queue is also way too high. For example a few of the line that I
see in the logs are as follows:

http://pastebin.com/Gv1S9GKX

The regionserver whose logs are pasted above keeps on flushing and creating
those small files shows the follwoing metrics:
memstoreSizeMB=657, compactionQueueSize=233, flushQueueSize=0,
usedHeapMB=3907, maxHeapMB=10231

I am unsure why it's causing such high amount of flush (< 100m) even though
the flush size is at 128m and there is no memory pressure.

Any thoughts ? Let me know if you need any more information, I also have
ganglia running and can provide more metrics if needed.

Thanks,
Viral
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB