Re: HBaseClient.call() hang
Hey Bryan,
which version of HBase is this?

-- Lars

________________________________
 From: Bryan Keller <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Friday, December 14, 2012 2:59 PM
Subject: HBaseClient.call() hang
 
I have encountered a problem with HBaseClient.call() hanging. This occurs when one of my regionservers goes down while performing a table scan.

What exacerbates this problem is that the scan I am performing uses filters, and the region size of the table is large (4 GB). Because of this, it can take several minutes for a row to be returned when calling scanner.next(). Apparently no keep-alive message is sent back to the scanner while the region server is busy, so I had to increase the hbase.rpc.timeout value to a large number (60 min); otherwise the next() call will time out waiting for the regionserver to send something back.

The result is that this HBaseClient.call() hang is made much worse, because it won't time out for 60 minutes.

I have a couple of questions:

1. Any thoughts on why HBaseClient.call() is getting stuck? I noticed that call.wait() is not using any timeout, so it will wait indefinitely until interrupted externally.

2. Is there a solution where I do not need to set hbase.rpc.timeout to a very large number? My only thought would be to forgo using filters and do the filtering client-side, which seems pretty inefficient.
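
To illustrate the point in question 1, here is a minimal sketch of the difference between an unbounded Object.wait() and a wait bounded by a deadline. This is not the HBase code: PendingCall, complete() and awaitDone() are hypothetical names made up for the example, and the real HBaseClient.call() may be structured differently.

```java
public class BoundedWaitSketch {

    // Hypothetical stand-in for an in-flight RPC; not the HBase Call class.
    static final class PendingCall {
        private boolean done; // guarded by this object's monitor

        // Marks the call finished and wakes any thread blocked in awaitDone().
        synchronized void complete() {
            done = true;
            notifyAll();
        }

        // Waits until complete() is called or timeoutMillis has elapsed.
        // Returns true if the call completed, false if the deadline passed.
        synchronized boolean awaitDone(long timeoutMillis) throws InterruptedException {
            long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
            while (!done) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // give up instead of blocking forever
                }
                // wait(0) means "wait forever", so never pass less than 1 ms.
                wait(Math.max(1L, remainingNanos / 1_000_000L));
            }
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Nobody ever completes this call, as when the server has died.
        PendingCall lost = new PendingCall();
        System.out.println("lost call completed: " + lost.awaitDone(200));

        // This call is completed by another thread shortly after we start waiting.
        PendingCall answered = new PendingCall();
        Thread responder = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            answered.complete();
        });
        responder.start();
        System.out.println("answered call completed: " + answered.awaitDone(5000));
        responder.join();
    }
}
```

The loop recomputes the remaining time on every wake-up, so spurious wake-ups or unrelated notifications cannot extend the wait past the deadline. With a plain wait() in place of wait(long), the first call above would never return unless the thread were interrupted.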

Here is a stack dump of the thread that was hung:

Thread 10609: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
- org.apache.hadoop.hbase.ipc.HBaseClient.call(org.apache.hadoop.io.Writable, java.net.InetSocketAddress, java.lang.Class, org.apache.hadoop.hbase.security.User, int) @bci=51, line=904 (Interpreted frame)
- org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=52, line=150 (Interpreted frame)
- $Proxy12.next(long, int) @bci=26 (Interpreted frame)
- org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=72, line=92 (Interpreted frame)
- org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=1, line=42 (Interpreted frame)
- org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(org.apache.hadoop.hbase.client.ServerCallable) @bci=36, line=1325 (Interpreted frame)
- org.apache.hadoop.hbase.client.HTable$ClientScanner.next() @bci=117, line=1299 (Compiled frame)
- org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue() @bci=41, line=150 (Interpreted frame)
- org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue() @bci=4, line=142 (Interpreted frame)
- org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue() @bci=4, line=458 (Interpreted frame)
- org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue() @bci=4, line=76 (Interpreted frame)
- org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue() @bci=4, line=85 (Interpreted frame)
- org.apache.hadoop.mapreduce.Mapper.run(org.apache.hadoop.mapreduce.Mapper$Context) @bci=6, line=139 (Interpreted frame)
- org.apache.hadoop.mapred.MapTask.runNewMapper(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapreduce.split.JobSplit$TaskSplitIndex, org.apache.hadoop.mapred.TaskUmbilicalProtocol, org.apache.hadoop.mapred.Task$TaskReporter) @bci=201, line=645 (Interpreted frame)
- org.apache.hadoop.mapred.MapTask.run(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=100, line=325 (Interpreted frame)
- org.apache.hadoop.mapred.Child$4.run() @bci=29, line=268 (Interpreted frame)
- java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
- javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=396 (Interpreted frame)
- org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1332 (Interpreted frame)
- org.apache.hadoop.mapred.Child.main(java.lang.String[]) @bci=776, line=262 (Interpreted frame)