Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
HBase >> mail # user >> HBaseClient.call() hang

Copy link to this message
HBaseClient.call() hang
I have encountered a problem with HBaseClient.call() hanging. This occurs when one of my regionservers goes down while performing a table scan.

What exacerbates this problem is that the scan I am performing uses filters, and the region size of the table is large (4gb). Because of this, it can take several minutes for a row to be returned when calling scanner.next(). Apparently there is no keep alive message being sent back to the scanner while the region server is busy, so I had to increase the hbase.rpc.timeout value to a large number (60 min), otherwise the next() call will timeout waiting for the regionserver to send something back.

The result is that this HBaseClient.call() hang is made much worse, because it won't time out for 60 minutes.

I have a couple of questions:

1. Any thoughts on why the HBaseClient.call() is getting stuck? I noticed that call.wait() is not using any timeout so it will wait indefinitely until interrupted externally

2. Is there a solution where I do not need to set hbase.rpc.timeout to a very large number? My only thought would be to forego using filters and do the filtering client side, which seems pretty inefficient

Here is a stack dump of the thread that was hung:

Thread 10609: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
 - org.apache.hadoop.hbase.ipc.HBaseClient.call(org.apache.hadoop.io.Writable, java.net.InetSocketAddress, java.lang.Class, org.apache.hadoop.hbase.security.User, int) @bci=51, line=904 (Interpreted frame)
 - org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=52, line=150 (Interpreted frame)
 - $Proxy12.next(long, int) @bci=26 (Interpreted frame)
 - org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=72, line=92 (Interpreted frame)
 - org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=1, line=42 (Interpreted frame)
 - org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(org.apache.hadoop.hbase.client.ServerCallable) @bci=36, line=1325 (Interpreted frame)
 - org.apache.hadoop.hbase.client.HTable$ClientScanner.next() @bci=117, line=1299 (Compiled frame)
 - org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue() @bci=41, line=150 (Interpreted frame)
 - org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue() @bci=4, line=142 (Interpreted frame)
 - org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue() @bci=4, line=458 (Interpreted frame)
 - org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue() @bci=4, line=76 (Interpreted frame)
 - org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue() @bci=4, line=85 (Interpreted frame)
 - org.apache.hadoop.mapreduce.Mapper.run(org.apache.hadoop.mapreduce.Mapper$Context) @bci=6, line=139 (Interpreted frame)
 - org.apache.hadoop.mapred.MapTask.runNewMapper(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapreduce.split.JobSplit$TaskSplitIndex, org.apache.hadoop.mapred.TaskUmbilicalProtocol, org.apache.hadoop.mapred.Task$TaskReporter) @bci=201, line=645 (Interpreted frame)
 - org.apache.hadoop.mapred.MapTask.run(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=100, line=325 (Interpreted frame)
 - org.apache.hadoop.mapred.Child$4.run() @bci=29, line=268 (Interpreted frame)
 - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
 - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=396 (Interpreted frame)
 - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1332 (Interpreted frame)
 - org.apache.hadoop.mapred.Child.main(java.lang.String[]) @bci=776, line=262 (Interpreted frame)
lars hofhansl 2012-12-15, 01:31
Bryan Keller 2012-12-15, 05:29
Ted Yu 2012-12-15, 05:49
Ted Yu 2012-12-15, 05:59
Bijieshan 2012-12-17, 01:31
Bryan Keller 2012-12-17, 17:18
Azuryy Yu 2012-12-18, 00:50
Mesika, Asaf 2012-12-18, 20:27
Bryan Keller 2012-12-18, 22:25
Bryan Keller 2012-12-18, 20:05