I have implemented the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete.
I have only tested it on a one-node cluster so far, and even there I am getting a very promising performance boost.
Only one CF and qualifier.
10K rows deleted in total, in batches of 100 deletes. Only deletes were happening on the table, from one thread.
With the new approach, the net time taken is reduced by more than 1/10.
I will also test on a 4-node cluster. I think this change will be worth doing.
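For anyone who wants to repeat the test setup described above, here is a minimal sketch of how the 10K row keys could be split into batches of 100. It is only an illustration: DeleteBatches and partition are hypothetical names, not HBase code, and the row-key format is made up. In a real test each batch would be turned into a List<Delete> and passed to HTable#delete(List<Delete>).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: splits items into fixed-size batches.
final class DeleteBatches {

    // Returns consecutive sub-lists of at most batchSize elements each.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Assumed row-key format, for illustration only.
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            rows.add(String.format("row-%05d", i));
        }
        List<List<String>> batches = partition(rows, 100);
        System.out.println(batches.size() + " batches");
    }
}
```

With 10K rows and a batch size of 100 this gives 100 client-side delete calls.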
From: [EMAIL PROTECTED] [[EMAIL PROTECTED]]
Sent: Wednesday, June 20, 2012 6:31 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Can there be a doMiniBatchDelete in HRegion?
I think you can issue a large number of deletes on the same region and observe whether the proposed new method gives us a performance boost.
On Jun 20, 2012, at 2:49 AM, Anoop Sam John <[EMAIL PROTECTED]> wrote:
> Hi Devs
> There is batch put support at the HRegion level. When put(List<Put>) is called from the client, Puts for the same region may be grouped together and handled as a batch (depending on the availability of row locks; see the code in HRegion#doMiniBatchPut). For this batch there is a single write and sync to the HLog file.
> I am not able to see a similar delete operation in HRegion. HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one network call. But within the RS, there are N delete calls on the region, one by one, which means N HLog writes and syncs. If these could also be grouped, could we get better performance for multi-row deletes? Is there any problem with doing this batch delete? I am not sure whether a JIRA already exists for this.
> Note: in HRegion#mutateRowsWithLock() we do batch operations of Puts and also Deletes.
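To make the N-syncs-versus-one-sync point concrete, here is a toy model. It is not HBase code: WalSyncModel and ToyLog are made-up names, the log only counts calls, and row locks (which the real doMiniBatchPut depends on) are ignored. It just contrasts one append and sync per delete with a single append and sync for the whole batch.

```java
import java.util.Collections;
import java.util.List;

// Toy model only; none of these classes exist in HBase.
final class WalSyncModel {

    // Stand-in for the HLog: counts appended edits and sync calls.
    static final class ToyLog {
        int appends = 0;
        int syncs = 0;

        void append(List<String> rows) {
            appends += rows.size();
        }

        void sync() {
            syncs++;
        }
    }

    // Current behaviour as described in the thread: N deletes, N writes and syncs.
    static void deleteOneByOne(ToyLog log, List<String> rows) {
        for (String row : rows) {
            log.append(Collections.singletonList(row));
            log.sync();
        }
    }

    // Proposed behaviour: the whole batch is written and synced once.
    static void deleteAsBatch(ToyLog log, List<String> rows) {
        log.append(rows);
        log.sync();
    }
}
```

For a batch of 100 rows, deleteOneByOne records 100 syncs in this model while deleteAsBatch records 1. Both log the same 100 edits.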