HBase, mail # user - Slow region server recoveries


Re: Slow region server recoveries
Varun Sharma 2013-04-19, 01:37
I am wondering whether DFSClient caches the datanode for a long period of time?

Varun
On Thu, Apr 18, 2013 at 6:01 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> Hi,
>
> We are facing problems with really slow HBase region server recoveries,
> around 20 minutes. The version is HBase 0.94.3 compiled with hadoop.profile=2.0.
>
> The Hadoop version is CDH 4.2 with HDFS-3703 and HDFS-3912 patched in and
> the stale node timeouts configured correctly. Dead node detection still
> takes 10 minutes.
>
> We see that our region server is trying to read an HLog and is stuck there
> for a long time. Logs here:
>
> 2013-04-12 21:14:30,248 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
> connect to /10.156.194.251:50010 for file
> /hbase/feeds/fbe25f94ed4fa37fb0781e4a8efae142/home/1d102c5238874a5d82adbcc09bf06599
> for block
> BP-696828882-10.168.7.226-1364886167971:blk_-3289968688911401881_9428:java.net.SocketTimeoutException:
> 15000 millis timeout while waiting for channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected local=/10.156.192.173:52818 remote=/10.156.194.251:50010]
>
> I would have expected HDFS-3703 to make the server fail fast and move on to
> the third datanode. Currently, the recovery seems way too slow for
> production usage...
>
> Varun
>
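
For reference, the roughly 10 minute dead node detection mentioned above is consistent with how the NameNode expires a datanode: as far as I recall, the expiry interval is twice the heartbeat recheck interval plus ten heartbeat intervals. The sketch below only reproduces that arithmetic. The class name `DeadNodeTimeout` is hypothetical, and the values used (300000 ms for `dfs.namenode.heartbeat.recheck-interval`, 3 s for `dfs.heartbeat.interval`) are assumed stock Hadoop 2.x defaults, not settings taken from this cluster, so check them against your own hdfs-site.xml.

```java
public class DeadNodeTimeout {

    // Assumed expiry rule: 2 * recheck interval + 10 * heartbeat interval.
    // recheckIntervalMs is in milliseconds, heartbeatIntervalSec in seconds.
    static long expireIntervalMs(long recheckIntervalMs, long heartbeatIntervalSec) {
        return 2 * recheckIntervalMs + 10 * 1000L * heartbeatIntervalSec;
    }

    public static void main(String[] args) {
        long recheckMs = 300_000L; // assumed default, dfs.namenode.heartbeat.recheck-interval
        long heartbeatSec = 3L;    // assumed default, dfs.heartbeat.interval

        long ms = expireIntervalMs(recheckMs, heartbeatSec);
        System.out.println(ms + " ms = " + (ms / 60000.0) + " minutes");
    }
}
```

With those assumed defaults this comes to 630000 ms, about 10.5 minutes. The stale node settings are separate from this timer, so lowering the stale interval would not by itself shorten dead node detection.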