HBase >> mail # user >> HBase random read performance


Ankit Jain 2013-04-13, 05:31
Ted Yu 2013-04-13, 15:16
Adrien Mogenet 2013-04-13, 16:00
Harsh J 2013-04-13, 17:02
Re: HBase random read performance
Hi Ankit,

Reads can be affected by many aspects of your system.

As proposed above, Bloom filters can help, but so can caching, region size
and splits, etc. If this is the only table in your cluster, and so you have
only 16 regions, you might want to split your table into smaller pieces. Also,
what about the size of your key? If it's a 1024-byte key, 10,000 gets
means about 10 MB of key data to send, which is a lot.
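
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. The 1024-byte key is only the hypothetical from the paragraph above; the 10,000 rows and 17 seconds are the figures reported in the original question. The function names are made up for illustration, and the per-row figure is amortized wall-clock time, not a measured per-request latency.

```python
# Back-of-the-envelope figures for the multi-get discussed in this thread.
# Assumptions: the 1024-byte key is hypothetical; the row count and the
# 17 s duration are the numbers reported by the original poster.


def request_key_bytes(num_gets: int, key_size_bytes: int) -> int:
    """Total bytes of row keys the client has to send for one multi-get."""
    return num_gets * key_size_bytes


def rows_per_second(num_rows: int, elapsed_seconds: float) -> float:
    """Overall throughput of the multi-get call."""
    return num_rows / elapsed_seconds


def ms_per_row(num_rows: int, elapsed_seconds: float) -> float:
    """Wall-clock time divided evenly over the rows (amortized, not latency)."""
    return elapsed_seconds * 1000.0 / num_rows


if __name__ == "__main__":
    total = request_key_bytes(10_000, 1024)
    print(f"keys alone: {total} bytes (~{total / 1e6:.1f} MB)")
    print(f"throughput: {rows_per_second(10_000, 17.0):.0f} rows/s")
    print(f"amortized:  {ms_per_row(10_000, 17.0):.1f} ms per row")
```

With a much smaller key the request payload shrinks proportionally, so the key size is worth checking before tuning anything else.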

Do you have more details on your use case and your goals?

JM

2013/4/13 Harsh J <[EMAIL PROTECTED]>

> > We are getting very low random read performance while performing
> > multi get from HBase.
>
> What exactly are you trying to test here though? 10000 random rows in
> a single multi-get action from a single application thread, returning
> the assembled list from across 5 servers in 17s, is an indicator
> of what, w.r.t. your application?
>
> On Sat, Apr 13, 2013 at 9:30 PM, Adrien Mogenet
> <[EMAIL PROTECTED]> wrote:
> > Using Bloom filters is almost mandatory there.
> > You might also want to try Short Circuit Reads and be sure you get 100%
> > data locality (major_compact your table first).
> >
> >
> > On Sat, Apr 13, 2013 at 5:16 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> >> Did you enable Bloom filters?
> >> See http://hbase.apache.org/book.html#schema.bloom
> >>
> >> Cheers
> >>
> >> On Fri, Apr 12, 2013 at 10:31 PM, Ankit Jain <[EMAIL PROTECTED]>
> >> wrote:
> >>
> >> > Hi All,
> >> >
> >> > We are using HBase 0.94.5 and Hadoop 1.0.4.
> >> >
> >> > We have an HBase cluster of 5 nodes (5 regionservers and 1 master node).
> >> > Each regionserver has 8 GB RAM.
> >> >
> >> > We have loaded 25 million records into an HBase table, pre-split
> >> > into 16 regions, and all the regions are equally loaded.
> >> >
> >> > We are getting very low random read performance while performing
> >> > multi get from HBase.
> >> >
> >> > We are passing 10000 random row keys as input, and HBase is taking
> >> > around 17 secs to return the 10000 records.
> >> >
> >> > Please suggest some tuning to increase HBase read performance.
> >> >
> >> > Thanks,
> >> > Ankit Jain
> >> > iLabs
> >> >
> >> >
> >> >
> >> > --
> >> > Thanks,
> >> > Ankit Jain
> >> >
> >>
> >
> >
> >
> > --
> > Adrien Mogenet
> > http://www.borntosegfault.com
>
>
>
> --
> Harsh J
>
Anoop Sam John 2013-04-15, 10:17
Rishabh Agrawal 2013-04-15, 10:42
Ankit Jain 2013-04-15, 10:53
谢良 2013-04-15, 11:41
Ankit Jain 2013-04-15, 13:04
Doug Meil 2013-04-15, 13:21
Ted Yu 2013-04-15, 13:30
Ted Yu 2013-04-15, 14:13
Ted Yu 2013-04-15, 17:03
lars hofhansl 2013-04-16, 14:55
Liu, Raymond 2013-04-16, 07:49
Nicolas Liochon 2013-04-16, 08:22
Jean-Marc Spaggiari 2013-04-16, 11:01
Michel Segel 2013-04-17, 12:33
Håvard Wahl Kongsgård 2013-04-14, 22:19
Mohammad Tariq 2013-04-14, 22:39
Ted Yu 2013-07-08, 12:49