HBase >> mail # user >> hbase delete operation is very slow


Haijia Zhou 2012-02-21, 20:52
Doug Meil 2012-02-21, 22:45
Stack 2012-02-22, 00:39
Doug Meil 2012-02-22, 01:54
Stack 2012-02-22, 02:13
RE: hbase delete operation is very slow
Thanks for the suggestion. I did use a List<Delete> of size 1000, but the performance was not much different from deleting one row at a time.
I looked into the HRegion.delete() method. My understanding is that when you call delete() to delete a whole row, it first deletes every column family for that row, meaning it puts a tombstone on each column family.
In my case each row has 5 column families, so each delete puts 5 tombstones on the row. I think that could be the reason why delete is so slow.

I am just wondering whether there is any way, or any tool, to profile an HBase application and measure the time taken by each individual method.
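
Short of attaching a real profiler, one rough way to see where client-side time goes is to wrap each step in a timer. The sketch below is only an illustration: StepTimer and its method names are hypothetical and not part of HBase, and it measures wall-clock time in the client JVM only, so it says nothing about time spent inside the region server (for example in HRegion.delete()).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Accumulates wall-clock time per label; a crude stand-in for a real profiler. */
class StepTimer {
    private final Map<String, Long> totalNanos = new LinkedHashMap<>();
    private final Map<String, Integer> callCounts = new LinkedHashMap<>();

    /** Runs the step, records how long it took under the label, returns its result. */
    <T> T time(String label, Supplier<T> step) {
        long start = System.nanoTime();
        try {
            return step.get();
        } finally {
            // Recorded even if the step throws, so failed calls are counted too.
            long elapsed = System.nanoTime() - start;
            totalNanos.merge(label, elapsed, Long::sum);
            callCounts.merge(label, 1, Integer::sum);
        }
    }

    int callCount(String label) {
        return callCounts.getOrDefault(label, 0);
    }

    long totalNanos(String label) {
        return totalNanos.getOrDefault(label, 0L);
    }

    /** One line per label, in the order the labels were first seen. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : totalNanos.entrySet()) {
            sb.append(String.format("%s: %d calls, %.3f ms total%n",
                    e.getKey(), callCounts.get(e.getKey()), e.getValue() / 1e6));
        }
        return sb.toString();
    }
}
```

Wrapping the code that builds the List<Delete> under one label and the call that sends it under another would at least show whether the time is spent preparing the deletes or waiting on the server.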

Haijia

-----Original Message-----
From: Doug Meil [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 21, 2012 8:54 PM
To: [EMAIL PROTECTED]
Subject: Re: hbase delete operation is very slow
I don't think write-buffering is an option because that was Put-only the last time I looked, but the advice I put in the book is to use delete(List<Delete>). He'll have to keep track of the List<Delete> himself and determine when the batch should be sent, but it's a lot better than one at a time.
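
The bookkeeping described above (collect deletes, decide when the batch is sent) can be sketched as a small buffer. This is a minimal sketch under assumptions, not code from the thread or from HBase: BatchBuffer is a hypothetical helper, kept generic so it runs without the HBase jars. In a real client the item type would be Delete and the sink would be something like batch -> table.delete(batch).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Collects items and hands them to a sink in fixed-size batches.
 * Assumption: in an HBase client, T would be Delete and the sink
 * would call delete(List<Delete>) on the table.
 */
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Queues one item and sends the batch once it reaches batchSize. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is queued. Call once more after the last add(). */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Hand over a copy so the sink is free to reorder or modify the list.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

With a batch size of 1000, adding 2500 items and then calling flush() results in three calls to the sink (1000, 1000 and 500 items) instead of 2500 separate invocations.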
On 2/21/12 7:39 PM, "Stack" <[EMAIL PROTECTED]> wrote:

>On Tue, Feb 21, 2012 at 2:45 PM, Doug Meil
><[EMAIL PROTECTED]> wrote:
>>
>> Hi there-
>>
>> You probably want to see this...
>>
>> http://hbase.apache.org/book.html#perf.deleting
>>
>> .. that particular method doesn't use the write-buffer and is
>> submitting deletes one-by-one to the RS's.
>>
>>
>
>Do what Doug suggests.  Sounds like you are setting up a Map per row
>and then per row, figuring whether to Delete.  If a Delete, you do an
>invocation per.  Where are you getting your table instance from?  Is it
>created each time?  And as per Doug, are you write buffering your
>deletes?
>
>St.Ack
>
Ioan Eugen Stan 2012-02-23, 14:57
Daniel Iancu 2012-02-23, 18:36