Re: HBase scan performance decreases over time.
Hello Ted,

We never initiate major compaction manually. I have not looked at I/O
balance between nodes in detail. We have noticed that after running for a
couple of weeks HBase seems to spend hours pushing blocks between nodes in
order to optimize things. We add data daily in one ~30 GB push to several
tables. Sometimes nodes get added to the running system.

Where can I get more information on how to carry out performance-related
HBase administrative tasks?

Thank you,

/David
On Sat, Nov 3, 2012 at 4:42 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Can you tell us how often you run major compaction after the import?
> Have you noticed imbalanced read/write requests in the cluster? That is, a
> subset of region servers receives the bulk of the writes.
>
> We do some manual movement of regions when the above happens.
>
> Cheers
>
> On Sat, Nov 3, 2012 at 8:12 AM, David Koch <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > Every now and then we need to flatten our cluster and re-import all data
> > from log files (changes in data format, etc.). Afterwards we notice a
> > significant increase in scan performance. As data is added and shuffled
> > around between region servers, performance goes down again over time (say a
> > couple of weeks). Are there any routine operations that one should run
> > manually, or settings to activate in the HBase configuration to keep the
> > data well distributed? We use HBase 0.92 as part of a Cloudera4 cluster.
> >
> > Thank you,
> >
> > /David
> >
>