Re: HBase secondary index performance
First, check that your connection is not in autoflush mode.
Second, consider building a custom index instead
of using indexedtable. In my experience, custom indexing
(especially if the data is never modified) performs much better.
The problem with indexedtable is that on every insert
HBase performs one (random) get operation (to check whether
previously indexed data exists, and to delete it if it does). Random gets
run at around 100-400 requests per second per node, so the 60 you are
getting looks about right (for indexedtable).

You can read about how to build custom indexes here:
http://brunodumon.wordpress.com/2010/02/17/building-indexes-using-hbase-mapping-strings-numbers-and-dates-onto-bytes/
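
To make the idea a bit more concrete, here is a minimal sketch of the kind
of byte mapping that article's title refers to, written in Python for
brevity. It is not taken from the article or from the HBase API: the function
names (encode_int64, index_row_key, and so on) and the key layout (8-byte
encoded value followed by the primary row key) are my own assumptions for
illustration. The point is that if the indexed value sorts correctly as raw
bytes, an index entry can be written with a plain put and no preceding get,
provided the data is never modified.

```python
import struct

SIGN_OFFSET = 1 << 63  # shifts the signed 64-bit range onto the unsigned range


def encode_int64(value: int) -> bytes:
    """Encode a signed 64-bit integer so that byte order matches numeric order.

    Adding 2**63 flips the sign bit; big-endian packing then makes
    lexicographic (unsigned) byte comparison agree with numeric comparison.
    """
    return struct.pack(">Q", value + SIGN_OFFSET)


def decode_int64(data: bytes) -> int:
    """Inverse of encode_int64."""
    return struct.unpack(">Q", data)[0] - SIGN_OFFSET


def index_row_key(value: int, primary_key: bytes) -> bytes:
    """Hypothetical index row key: fixed-width encoded value + primary row key.

    Because the value part has a fixed width of 8 bytes, the primary key can
    be recovered without a separator.
    """
    return encode_int64(value) + primary_key


def split_index_row_key(key: bytes) -> tuple[int, bytes]:
    """Recover (value, primary_key) from a key built by index_row_key."""
    return decode_int64(key[:8]), key[8:]


if __name__ == "__main__":
    rows = [(b"row-a", 42), (b"row-b", -7), (b"row-c", 1000)]
    # Sorting the index keys as raw bytes yields rows ordered by value,
    # which is the order a scan over a table keyed this way would return.
    for key in sorted(index_row_key(v, pk) for pk, v in rows):
        print(split_index_row_key(key))
```

A range query on the indexed column then becomes a scan of the index table
between two encoded values, and each matching key carries the primary row key
needed to fetch the full row.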

2010/9/2 Murali Krishna. P <[EMAIL PROTECTED]>:
> Hi,
>    I have an indexedtable with index on around 20 columns. The write
> performance on the original table is around 60 per second. This is just a one
> node setup. Even with multiple parallel clients, I am getting just 60
> writes/second. That means a total of 60 * 20 = 1200 writes/second due to the
> 20 index tables? This is not good enough for our application. Does this number,
> 1200, look right? I was expecting around 15k.
>    I am using 0.20.6 HBase on 0.20.2 Hadoop. hardware config (8g ram, 2core,
> 7.2k rpm disk). Will adding nodes increase the writes linearly?
>
>  Thanks,
> Murali Krishna
>