

RE: RS unresponsive after series of deletes
> What kind of a delete are you doing?  

A mixture of row and cell deletes.  Interestingly, the first 19
(successful) deletes were row deletes.  The client got hung up while
submitting its first batch of cell deletes.  However, I think the
cell/row distinction is a red herring as we've experienced this behavior
at least once with batches of exclusively row deletes.

> When you say 19 deletes, each of these is a batch delete?

Each of the 19 deletes is a call to HTable.delete(List<Delete>).  I
estimate there were about 144 Deletes in each batch.  In the cell
delete that failed, I estimate about 1000 column qualifiers per row,
for a total of about 144k cells in that batch.
  
> Could it be that a batch is doing a bunch at the one time and taking a
> long time to complete?

To issue the cell delete, we scan each row's column keys for matches
against an in-memory set of domain objects.  The code that constructs
the delete completes quickly.

I should add that most of our deletes are very fast.  But on 3 occasions
thus far, they have exceeded the 10 minutes allotted by the retry logic
in the client.

> Try making smaller batches?  Want to try thread dumping it when it
> goes unresponsive?

I will try to reproduce with a test harness.
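
For the smaller-batch suggestion, here is a minimal sketch of how a client could split one large list into fixed-size chunks and submit each chunk separately.  This is not taken from our code: BatchSplitter, chunk, and the chunk size of 25 are hypothetical, and generic elements stand in for Delete objects so the snippet runs without HBase on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one large batch into smaller, independent lists.
final class BatchSplitter {

    // Returns consecutive chunks of at most chunkSize elements, in original order.
    static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the range so each chunk is its own list, not a view of the source.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Stand-in for a batch of about 144 Delete objects.
        List<Integer> batch = new ArrayList<>();
        for (int i = 0; i < 144; i++) {
            batch.add(i);
        }
        for (List<Integer> small : chunk(batch, 25)) {
            // In the real client, this is where each smaller list would be
            // passed to HTable.delete(List<Delete>) instead of being printed.
            System.out.println("would submit " + small.size() + " deletes");
        }
    }
}
```

Submitting the chunks one at a time should also make it easier to see which chunk, if any, is in flight when the region server stops responding.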
  
> Do you have gc logging enabled?  Anything in the .out file at this
> time when we are using CPU?

I don't see any GC-related operations longer than 10s.  Here is the log
from the time of the first failure to 20 minutes after:
http://pastebin.com/AUaULHcD

-Ted