Re: Client receives SocketTimeoutException (CallerDisconnected on RS)
Hi guys,

1/ I quickly checked the GC logs and saw nothing. Since I need very
fast lookups, I set the zookeeper.session.timeout parameter to 10s so
that a RS is considered dead even after very short pauses, and that
never happened.
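
(For reference, the value the client actually picks up from
hbase-site.xml can be double-checked with a minimal sketch like the one
below; note the property also has to be set on the region servers for
dead-RS detection, and the ZooKeeper server caps sessions at its
maxSessionTimeout.)

Configuration conf = HBaseConfiguration.create();
// -1 means the property was absent from the loaded configuration
int zkTimeout = conf.getInt("zookeeper.session.timeout", -1);
System.out.println("zookeeper.session.timeout = " + zkTimeout + " ms");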

2/ I have not checked yet, but I don't think I ran out of sockets
since the ulimit has been set very high. I'll check!
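
(A quick way to watch the client's descriptor usage from inside the
JVM, assuming an Oracle/OpenJDK runtime on a Unix-like OS, is the
com.sun.management MXBean; lsof from the shell works just as well:)

import java.lang.management.ManagementFactory;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdCheck {
  public static void main(String[] args) {
    // This cast only works on Unix-like Oracle/OpenJDK JVMs
    UnixOperatingSystemMXBean os = (UnixOperatingSystemMXBean)
        ManagementFactory.getOperatingSystemMXBean();
    System.out.println("open fds: " + os.getOpenFileDescriptorCount()
        + " / max: " + os.getMaxFileDescriptorCount());
  }
}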

3/ The benchmark can launch several R/W threads, but even the simplest
program leads to my issue:

Configuration config = HBaseConfiguration.create();
HTable table = new HTable(config, "test");
List<Get> getsList = new ArrayList<Get>();
int n = 1000;                          // or 1, 10, 100
for (int i = 0; i < n; i++)
  getsList.add(new Get(randomKey()));  // randomKey() = some random row key
table.get(getsList);
table.close();
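
(For completeness, a self-contained sketch of the same thing; the
randomKey() helper is hypothetical, since the real key distribution
depends on how the "test" table was loaded:)

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class MultiGetBench {
  private static final Random RND = new Random();

  // Hypothetical helper: any random row key generator will do here.
  private static byte[] randomKey() {
    return Bytes.toBytes(Long.toString(RND.nextLong()));
  }

  public static void main(String[] args) throws Exception {
    int n = Integer.parseInt(args[0]);  // 1, 10, 100 or 1000
    Configuration config = HBaseConfiguration.create();
    HTable table = new HTable(config, "test");
    try {
      List<Get> getsList = new ArrayList<Get>(n);
      for (int i = 0; i < n; i++) {
        getsList.add(new Get(randomKey()));
      }
      // One batched multi-get RPC round, as in the snippet above
      Result[] results = table.get(getsList);
      System.out.println("fetched " + results.length + " results");
    } finally {
      table.close();
    }
  }
}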

4/ I will share more logs tomorrow to dig deeper; right now I
personally need a long STW pause :-)

Cheers,

On Thu, Aug 23, 2012 at 7:49 PM, N Keywal <[EMAIL PROTECTED]> wrote:
> Hi Adrien,
>
> As well, could you share the client code (number of threads, number
> of regions, whether it's a set of single gets or multi gets, this
> kind of stuff)?
>
> Cheers,
>
> N.
>
>
> On Thu, Aug 23, 2012 at 7:40 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>> Hi Adrien,
>>
>> I would love to see the region server side of the logs while those
>> socket timeouts happen; also check the GC log. One thing people often
>> hit while doing pure random-read workloads with tons of clients is
>> running out of sockets because they are all stuck in CLOSE_WAIT.
>> You can check that with lsof. There are other discussions on this
>> mailing list about it.
>>
>> J-D
>>
>> On Thu, Aug 23, 2012 at 10:24 AM, Adrien Mogenet
>> <[EMAIL PROTECTED]> wrote:
>>> Hi there,
>>>
>>> While I'm performing read-intensive benchmarks, I'm seeing a storm
>>> of "CallerDisconnectedException" on certain RegionServers. As the
>>> documentation says, my client receives a SocketTimeoutException
>>> (60000ms etc...) at the same time.
>>> It's happening constantly and I get very poor read performance
>>> (from 10 to 5000 reads/sec) on a 10-node cluster.
>>>
>>> My benchmark consists of several iterations launching 10, 100 and
>>> 1000 Get requests, each on a random rowkey, with a single
>>> CF/qualifier.
>>> I'm using HBase 0.94.1 (a few commits before the official stable
>>> release) with Hadoop 1.0.3.
>>> Bloom filters have been enabled (at the rowkey level).
>>>
>>> I can't find very clear information about these exceptions. From
>>> the reference guide:
>>>   (...) you should consider digging in a bit more if you aren't doing
>>> something to trigger them.
>>>
>>> Well... could you help me dig? :-)

--
AM