Re: Slow random reads, SocketTimeoutExceptions
A cell is about 300 bytes in my case (the row-key length is 32 bytes).
In my current scenario, I generated 100 tables, each with a single column
family. I'm inserting between 100k and 300k rows per second, depending on
my settings, but that's not the point here: I'm mostly trying to get good
concurrent (random) read/write performance. My benchmark runs on 30 nodes.
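
For concreteness, the write side looks roughly like this (a simplified
sketch, not my exact benchmark code: the table/family/qualifier names and
the hex-MD5 key scheme are made-up examples that happen to produce 32-byte
keys and ~300-byte cells, using the 0.9x-era client API):

import java.security.MessageDigest;
import java.util.Random;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteBench {
  // 32-byte row key: hex-encoded MD5 digest of the row number (example scheme)
  static byte[] rowKey(long i) throws Exception {
    byte[] md5 = MessageDigest.getInstance("MD5").digest(Bytes.toBytes(i));
    StringBuilder hex = new StringBuilder(32);
    for (byte b : md5) hex.append(String.format("%02x", b));
    return Bytes.toBytes(hex.toString());
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "bench_000"); // one of the 100 tables
    table.setAutoFlush(false);                    // buffer puts client-side
    table.setWriteBufferSize(4 * 1024 * 1024);    // flush every ~4 MB
    byte[] family = Bytes.toBytes("f");           // the single column family
    byte[] value = new byte[300];                 // ~300-byte cell value
    Random rnd = new Random();
    for (long i = 0; i < 100000000L; i++) {       // 100M rows per table
      rnd.nextBytes(value);
      Put put = new Put(rowKey(i));
      put.add(family, Bytes.toBytes("q"), value);
      table.put(put);
    }
    table.flushCommits();                         // push the remaining buffer
    table.close();
  }
}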

On Wed, Jul 11, 2012 at 10:22 PM, Asaf Mesika <[EMAIL PROTECTED]> wrote:

> What's your cell value size?
> What do you mean by 100 tables in one column family?
> Can you please specify what was your insert rate and how many nodes you
> have?
>
> Sent from my iPhone
>
> On 11 Jul 2012, at 22:08, Adrien Mogenet <[EMAIL PROTECTED]>
> wrote:
>
> Hi there,
>
> I'm discovering HBase and comparing it with other distributed databases I
> know much better. I am currently stressing my testing platform (servers
> with 32 GB of RAM, 16 GB allocated to the HBase JVM) and I'm observing
> strange performance... I'm putting tons of well-spread data (100 tables of
> 100M rows, each with a single column family) and then performing random
> reads. I get good read performance while a table does not hold too much
> data, but on a big table I only get around 100-300 qps. I'm not swapping,
> I don't see any long GC pauses, and the insert rate is still very high,
> but nothing comes back from reads, and they often end in a
> SocketTimeoutException ("timeout while waiting for channel to be ready
> for read" exceptions, etc.).
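>
> For reference, a read in the benchmark is a plain Get, roughly like this
> (a simplified sketch, not my exact code; imports and the rowKey() example
> scheme are as in the write sketch above, names are placeholders):
>
>   Random rnd = new Random();
>   HTable table = new HTable(HBaseConfiguration.create(), "bench_000");
>   while (true) {
>     long row = (long) (rnd.nextDouble() * 100000000L); // 100M rows/table
>     Get get = new Get(rowKey(row));                    // 32-byte row key
>     get.addFamily(Bytes.toBytes("f"));
>     Result result = table.get(get);  // this is the call that times out
>   }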
>
> I noticed that certain StoreFiles were very big (~120 GB), so I adjusted
> the compaction strategy to not compact such big files (I don't know
> whether this is related to my issue).
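>
> Concretely, I set something like the following in hbase-site.xml
> (assuming hbase.hstore.compaction.max.size is honored by this version;
> the 10 GB threshold is just an example value):
>
>   <property>
>     <name>hbase.hstore.compaction.max.size</name>
>     <!-- store files larger than 10 GB are excluded from compactions -->
>     <value>10737418240</value>
>   </property>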
>
> I noticed that when I'm stressing the cluster with Get requests,
> everything *looks* fine until a RegionServer cannot serve the data
> locally and has to fetch it from HDFS, resulting in heavy and long
> network use (more than 60 seconds, which throws the
> SocketTimeoutException).
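>
> The 60 seconds matches the default hbase.rpc.timeout of 60000 ms. Raising
> it in hbase-site.xml would make the exception go away, but it only hides
> the slow HDFS fetch; for example:
>
>   <property>
>     <name>hbase.rpc.timeout</name>
>     <!-- 120 s instead of the default 60 s; a stopgap, not a fix -->
>     <value>120000</value>
>   </property>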
>
> How does HBase handle data locality for random accesses? Could that be a
> lead for solving this kind of issue?
> My block cache of 5 GB is not full at all...
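>
> For reference, the cache is sized via hfile.block.cache.size, a fraction
> of the region server heap; ~5 GB out of my 16 GB heap corresponds to
> roughly:
>
>   <property>
>     <name>hfile.block.cache.size</name>
>     <!-- 0.3 * 16 GB heap ~= 5 GB of block cache -->
>     <value>0.3</value>
>   </property>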
>
> --
> Adrien Mogenet
> http://www.mogenet.me
>

--
Adrien Mogenet
06.59.16.64.22
http://www.mogenet.me