HBase >> mail # user >> HBase scan performance decreases over time.

HBase scan performance decreases over time.

Every now and then we need to flatten our cluster and re-import all data
from log files (changes in data format, etc.). Afterwards we notice a
significant increase in scan performance. As data is added and shuffled
around between region servers, performance degrades again over time (over a
couple of weeks, say). Are there any routine operations that one should run
manually, or settings to enable in the HBase configuration, to keep the
data well distributed? We use HBase 0.92 as part of a Cloudera CDH4 cluster.

Thank you,