HBase, mail # user - Slow random reads, SocketTimeoutExceptions


Slow random reads, SocketTimeoutExceptions
Adrien Mogenet 2012-07-11, 19:07
Hi there,

I'm discovering HBase and comparing it with other distributed databases that I
know much better. I am currently stress-testing my test platform (servers
with 32 GB of RAM, 16 GB allocated to the HBase JVM) and I'm observing strange
performance. I'm inserting a large volume of well-spread data (100 tables of
100M rows each, in a single column family) and then performing random reads.
I get good read performance while a table does not hold much data, but on a
big table I only get around 100 to 300 qps. I'm not swapping, I don't see any
long GC pauses, and the insert rate is still very high, but reads return
almost nothing and often end in a SocketTimeoutException ("while waiting for
channel to be ready for read", etc.).

I noticed that certain StoreFiles were very big (~120 GB), so I adjusted the
compaction settings to not compact such big files (I don't know whether this
is related to my issue).

I noticed that when I'm stressing my cluster with Get requests, everything
*looks* fine until a RegionServer cannot serve the data locally and has to
fetch it from HDFS. This results in heavy network use lasting more than 60
seconds, which throws the SocketTimeoutException.

How does HBase handle data locality for random accesses? Could it be a
lead to solve this kind of issue?
My 5 GB block cache is not full at all...
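
For a rough sense of scale, here is a minimal back-of-the-envelope sketch in
Python. It only uses the figures given above (100 tables of 100M rows, a 5 GB
block cache); the average row size is NOT known and is a purely hypothetical
assumption, as is the function name. It estimates an upper bound on the share
of uniformly random reads that a cache of this size could serve, ignoring
HFile block granularity, compression, and per-entry overhead.

```python
def max_cache_hit_ratio(total_rows, avg_row_bytes, cache_bytes):
    """Upper bound on the cache hit ratio for uniformly random reads.

    Simplification: assumes the cache holds raw row bytes with no overhead,
    so the real ratio can only be lower than this.
    """
    data_bytes = total_rows * avg_row_bytes
    if data_bytes <= 0:
        raise ValueError("total data size must be positive")
    return min(1.0, cache_bytes / data_bytes)


# Figures taken from the message above.
TABLES = 100
ROWS_PER_TABLE = 100_000_000
BLOCK_CACHE_BYTES = 5 * 1024**3  # 5 GB block cache

# Hypothetical value: the message does not state the row size.
ASSUMED_AVG_ROW_BYTES = 100

total_rows = TABLES * ROWS_PER_TABLE
ratio = max_cache_hit_ratio(total_rows, ASSUMED_AVG_ROW_BYTES, BLOCK_CACHE_BYTES)
print(f"rows: {total_rows:,}, cacheable share at most: {ratio:.2%}")
```

Under that assumed row size, only a small fraction of the rows could ever sit
in the block cache at once, so most random Gets would have to go to the
StoreFiles on HDFS. Changing ASSUMED_AVG_ROW_BYTES to the real average shows
how sensitive the estimate is.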

--
Adrien Mogenet
http://www.mogenet.me