-
Can there be a doMiniBatchDelete in HRegion?
Anoop Sam John 2012-06-20, 09:49
Hi Devs
There is batch put support at the HRegion level. When put(List<Put>) is called from the client, the Puts for one region may be grouped together and handled as a batch [depending on the availability of row locks; see the code in HRegion#doMiniBatchPut]. For such a batch there is a single write and sync to the HLog file.
I am not able to see a similar delete operation in HRegion. HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one network call. But within the RS, there are N delete calls on the region, one by one, which means N HLog writes and syncs. If these could also be grouped, could we get better performance for multi-row deletes? Is there any problem with doing this batch delete? I am not sure whether a JIRA already exists for this.
Note: in HRegion#mutateRowsWithLock() we already do batch operations of Puts and Deletes.
-Anoop-
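To make the cost difference in the message above concrete, here is a minimal sketch that only counts log syncs. ToyWal, deleteOneByOne and deleteMiniBatch are hypothetical names invented for this illustration; they are not HBase classes or methods, and the real HRegion code also has to deal with row locks, the memstore and failures, which are left out here.

```java
import java.util.ArrayList;
import java.util.List;

final class WalSyncSketch {
    /** Toy stand-in for a write-ahead log. Hypothetical; NOT the HBase HLog API. */
    static final class ToyWal {
        final List<String> entries = new ArrayList<>();
        int syncs = 0;

        void append(String edit) { entries.add(edit); }

        /** A real sync is the expensive step; here it is only counted. */
        void sync() { syncs++; }
    }

    /** Per-delete path: each delete is appended and synced on its own. */
    static void deleteOneByOne(ToyWal wal, List<String> rows) {
        for (String row : rows) {
            wal.append("DELETE " + row);
            wal.sync();
        }
    }

    /** Mini-batch path: all edits are appended first, then synced once. */
    static void deleteMiniBatch(ToyWal wal, List<String> rows) {
        for (String row : rows) {
            wal.append("DELETE " + row);
        }
        wal.sync();
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add("row-" + i);
        }

        ToyWal perDelete = new ToyWal();
        deleteOneByOne(perDelete, rows);

        ToyWal batched = new ToyWal();
        deleteMiniBatch(batched, rows);

        System.out.println("per-delete syncs: " + perDelete.syncs);
        System.out.println("mini-batch syncs: " + batched.syncs);
    }
}
```

Both paths write the same number of log entries; only the number of syncs differs (N versus 1 for a batch of N). Whether that translates into a real speedup depends on how expensive a sync is in the actual deployment, which this sketch does not model.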
-
Re: Can there be a doMiniBatchDelete in HRegion?
yuzhihong@... 2012-06-20, 13:01
I think you can issue a large number of deletes on the same region and observe whether the proposed new method gives us a performance boost.
Thanks
-
RE: Can there be a doMiniBatchDelete in HRegion?
Anoop Sam John 2012-06-21, 03:30
Sure, Ted. I will test and report the result.
-Anoop-
-
RE: Can there be a doMiniBatchDelete in HRegion?
Anoop Sam John 2012-06-25, 13:50
I have implemented the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. So far I have tested only on a one-node cluster, and even there I am getting a very promising performance boost.
Test setup: only one CF and qualifier; 10K total rows deleted in batches of 100 deletes; only deletes happening on the table, from one thread.
With the new way the net time taken is reduced by more than 1/10. I will test on a 4-node cluster also. I think it will be worth making this change.
-Anoop-
-
Re: Can there be a doMiniBatchDelete in HRegion?
Ted Yu 2012-06-25, 13:56
After testing in the cluster, please open a JIRA and attach the results there.
Thanks for your effort, Anoop.
-
Re: Can there be a doMiniBatchDelete in HRegion?
Ted Yu 2012-06-25, 16:11
From another thread, the following is related to the optimization Anoop is testing:
In HRegionServer:
  public <R> MultiResponse multi(MultiAction<R> multi) throws IOException {
  ...
    for (Action<R> a : actionsForRegion) {
      action = a.getAction();
  ...
      if (action instanceof Delete) {
        delete(regionName, (Delete) action);
I think if we group the deletes of actionsForRegion, we can utilize the following:
  public int delete(final byte[] regionName, final List<Delete> deletes)
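The grouping suggested above could look roughly like the sketch below. Everything in it is a toy stand-in written for this illustration: Mutation, Put, Delete and ToyRegionServer are hypothetical classes, not the HBase ones, and the value returned by the list-based delete is an assumption (the thread does not say what the real method returns).

```java
import java.util.ArrayList;
import java.util.List;

final class GroupDeletesSketch {
    /** Hypothetical stand-ins for client mutations; not the HBase classes. */
    interface Mutation { String row(); }

    static final class Put implements Mutation {
        private final String row;
        Put(String row) { this.row = row; }
        public String row() { return row; }
    }

    static final class Delete implements Mutation {
        private final String row;
        Delete(String row) { this.row = row; }
        public String row() { return row; }
    }

    /** Toy server that records how often each entry point is invoked. */
    static final class ToyRegionServer {
        int putCalls = 0;
        int batchDeleteCalls = 0;
        final List<String> deletedRows = new ArrayList<>();

        void put(String regionName, Put p) { putCalls++; }

        /** Assumption: returns the number of deletes applied. */
        int delete(String regionName, List<Delete> deletes) {
            batchDeleteCalls++;
            for (Delete d : deletes) {
                deletedRows.add(d.row());
            }
            return deletes.size();
        }
    }

    /**
     * Collects every Delete for the region and issues one list-based call.
     * Note: this moves deletes after the other actions, so a real
     * implementation would have to decide whether that reordering is acceptable.
     */
    static void multiGrouped(ToyRegionServer rs, String regionName,
                             List<Mutation> actionsForRegion) {
        List<Delete> deletes = new ArrayList<>();
        for (Mutation action : actionsForRegion) {
            if (action instanceof Delete) {
                deletes.add((Delete) action);
            } else if (action instanceof Put) {
                rs.put(regionName, (Put) action);
            }
        }
        if (!deletes.isEmpty()) {
            rs.delete(regionName, deletes);
        }
    }

    public static void main(String[] args) {
        List<Mutation> actions = new ArrayList<>();
        actions.add(new Delete("r1"));
        actions.add(new Put("r2"));
        actions.add(new Delete("r3"));

        ToyRegionServer rs = new ToyRegionServer();
        multiGrouped(rs, "region-a", actions);

        System.out.println("batch delete calls: " + rs.batchDeleteCalls);
        System.out.println("deleted rows: " + rs.deletedRows);
    }
}
```

The point of the sketch is only that the per-action delete call inside the loop is replaced by a single call after it. Grouping at this level helps only if the list-based delete itself batches the HLog write and sync, which is what the proposed mini-batch delete is meant to provide.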
-
Re: Can there be a doMiniBatchDelete in HRegion?
Ted Yu 2012-06-27, 21:13
I created HBASE-6284
Cheers