RE: RS unresponsive after series of deletes
First off, J-D, thanks for helping me work through this.  You've
inspired some different angles and I think I've finally made it bleed in
a controlled way.

> - That data you are deleting needs to be read when you scan, like I
> said earlier a delete is in fact an insert in HBase and this isn't
> cleared up until a major compaction happens.

I manually compacted (via UI) the table that I deleted from.  The scan
times are still >10min.  When reading through each node's log, I see
some messages indicating the major compactions were going to be skipped.
Is it safe to say that hitting that 'Compact' button is just a
recommendation?  Is there an operation we can perform after a big delete
to guarantee that deletes get compacted away?
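
In case it helps, here is roughly what I'd try next from the client API
instead of the UI button (an untested sketch; "mytable" stands in for
our table name):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class RequestMajorCompact {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    // Request a major compaction of the whole table.  As I understand
    // it, this only queues the request and returns immediately -- each
    // RS can still decide to skip it, which would explain the "skipped"
    // messages in the node logs.
    admin.majorCompact("mytable");
  }
}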

> Do you have scanner caching turned on? Just to be sure set
> scan.setCaching(1) and see if it makes any difference.

A bit confused here.  Under what conditions would you recommend setting
the scan caching to 1?  My read path doesn't know whether a lot of data
was recently deleted, so I can't disable caching conditionally.  I want
scan caching in general, I believe.
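
For context, here is the shape of my scan setup next to what I take the
suggestion to be (a sketch; the 1000 is illustrative, not our exact
setting):

import org.apache.hadoop.hbase.client.Scan;

public class ScanSetup {
  // Our normal read path; caching is set unconditionally.
  static Scan buildScan() {
    Scan scan = new Scan();
    scan.setCaching(1000);  // rows fetched per next() RPC (illustrative)
    return scan;
  }

  // J-D's suggestion, as I read it: a one-off diagnostic scan, not a
  // production setting, to see whether per-RPC row batching matters.
  static Scan buildDiagnosticScan() {
    Scan scan = new Scan();
    scan.setCaching(1);  // one row per RPC
    return scan;
  }
}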

> Are you saying that you have Delete objects on which you did
> deleteColumn() 1000x? If so, look no further there's your problem.

I am calling deleteColumn() thousands of times per Delete object.
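
To make that concrete, here is a sketch of what my delete code does
(the helper and parameter names are made up; deleteColumn() and the
batched table.delete(List) are the actual client calls):

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;

public class BulkCellDelete {
  // One Delete per row, with deleteColumn() called once per qualifier,
  // i.e. thousands of tombstones per row.
  static void deleteRows(HTable table, byte[] family,
                         List<byte[]> rows,
                         List<byte[]> qualifiers) throws Exception {
    List<Delete> deletes = new ArrayList<Delete>();
    for (byte[] row : rows) {
      Delete d = new Delete(row);
      for (byte[] qualifier : qualifiers) {
        d.deleteColumn(family, qualifier);  // tombstone, latest version
      }
      deletes.add(d);
    }
    // Batched by the client; issuing ~10 of these at once is when the
    // RS trouble described below starts.
    table.delete(deletes);
  }
}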

I can delete a row w/ 20k keys in ~2 sec.  If I issue 10 of these (they
appear to be fired off asynchronously by the client), the unresponsive
RS behavior ensues.  Here is a stack dump from an RS that is running at
>90% utilization as it processes my deletes:

http://pastebin.com/8y5x4xU7

Some logs around this time:

http://pastebin.com/UpPMbsmn

So, my takeaway is that the RSs don't like being slammed w/ 100s of
thousands of cell deletes.  I can be more measured about these deletes
going forward.  That the RSs don't handle this more gracefully sounds
like a bug.  At a minimum, there appears to be a nonlinear response.
What do you think?

 