-
Is Put() operation a synchronous call on server side?
yun peng 2012-12-06, 13:02
Hi, since on the client side a Put() can return immediately when autoflush is turned off with setAutoFlush(false), I am wondering whether Put() on the HBase server side is executed synchronously. To be more specific, given a Put() that has already arrived at HRegion, will it block until all put-related operations are done, such as the write to the WAL and the write to the memstore, or even a flush to disk (though that may not happen every time)? Or does it just trigger the put-related operations and return immediately?
Besides, while researching this problem, I found it hard to locate the code that performs RPC in HBase, for example how the client-side HTable.put() invokes the server-side HRegion.put(). Can anyone point me to the related code path? Thanks.
Regards, Yun
-
Re: Is Put() operation a synchronous call on server side?
daidong 2012-12-06, 13:43
I think the path should be like this: put() -> flushCommits() -> processBatch() -> processBatchCallback() -> submit() -> ExecutorService.submit()
The callback we submit contains "connect" and "call" functions, which do the actual RPC work. See ProtobufUtil.java and its "multi" method. As for how to get the ClientProtocol, see the HConnectionManager.getProtocol() method.
Hope it helps. I would also like somebody to give more information about how Protobuf and HBaseRPC work together. :)
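To make the last hops of that path concrete, here is a minimal sketch of the submit-then-wait pattern using only java.util.concurrent. It is not HBase code: the class BatchSubmitSketch, the rpcFor() helper, and the server names are hypothetical stand-ins, and the "RPC" is just a string. It assumes the client submits one task per region server and then blocks on the returned futures, so the caller still waits for the results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchSubmitSketch {
    // Hypothetical stand-in for one per-server RPC; a real client would
    // serialize the batch and send it over the wire here.
    static Callable<String> rpcFor(String server, int puts) {
        return () -> server + ":" + puts + " puts applied";
    }

    // Submit one task per server, then block until every task has finished.
    static List<String> processBatch(ExecutorService pool, String[] servers, int putsPerServer) {
        List<Future<String>> futures = new ArrayList<>();
        for (String server : servers) {
            futures.add(pool.submit(rpcFor(server, putsPerServer)));
        }
        List<String> results = new ArrayList<>();
        for (Future<String> f : futures) {
            try {
                results.add(f.get()); // the caller waits here for each result
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(processBatch(pool, new String[] {"rs1", "rs2"}, 3));
        } finally {
            pool.shutdown();
        }
    }
}
```

Results come back in submission order because the futures are read in the order they were created, regardless of which task finishes first.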
-
Re: Is Put() operation a synchronous call on server side?
Harsh J 2012-12-06, 14:06
Hi Yun,
Yes, a single Put call is safely synchronous in nature. A Put is placed on the WAL, added to the MemStore, and then returned back as a success to the client if all went well.
A Put does not directly go to disk, and gets flushed from the MemStore based on regular flushing patterns or based on manual invocations of flush called on its region or its table.
The path to follow is quite simple - A Put goes from a Client (HTable) to a RegionServer (HRegionServer). You've already read the Client areas, so if you read HRegionServer#put(…) method(s), which is the server-end of it, you'll see the Server-RPC end of it.
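The write sequence described above (WAL first, then MemStore, then success, with the flush to disk happening separately) can be pictured with a toy model. This is only a sketch under stated assumptions: ToyRegion and its fields are hypothetical names, everything lives in memory, and none of it is the real HRegion implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class SyncPutSketch {
    // Hypothetical in-memory stand-in for a region; not HBase's HRegion.
    static class ToyRegion {
        final List<String> wal = new ArrayList<>();               // write-ahead log entries
        final TreeMap<String, String> memstore = new TreeMap<>(); // sorted in-memory buffer
        final List<TreeMap<String, String>> storeFiles = new ArrayList<>(); // "on disk"

        // Synchronous put: returns only after the WAL append and the MemStore update.
        boolean put(String row, String value) {
            wal.add(row + "=" + value); // 1. record the edit in the WAL
            memstore.put(row, value);   // 2. apply the edit to the MemStore
            return true;                // 3. report success to the caller
        }

        // Flush is a separate step, triggered later (periodically or manually).
        void flush() {
            if (memstore.isEmpty()) {
                return;
            }
            storeFiles.add(new TreeMap<>(memstore)); // snapshot becomes a new store file
            memstore.clear();
        }
    }

    public static void main(String[] args) {
        ToyRegion region = new ToyRegion();
        region.put("row1", "a");
        System.out.println("store files after put: " + region.storeFiles.size());
        region.flush();
        System.out.println("store files after flush: " + region.storeFiles.size());
    }
}
```

The point of the model is that put() never touches storeFiles; only flush() does.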
-- Harsh J
-
Re: Is Put() operation a synchronous call on server side?
yun peng 2012-12-06, 18:32
Hi, Dong and Harsh, thanks for your detailed explanations. Based on Dong's answer, I have summarised the call path a little: HTable.flushCommits()->HConnection.processBatch()->HConnectionManager#HConnectionImplementation.processBatch()->processBatchCallback()->ExecutorService.submit() I think there is some scheduling overhead in ExecutorService.submit().
Dong, I didn't find any code related to protobuf; my codebase is HBase 0.94.2, so maybe I am not using the most up-to-date version. By the way, I have another, somewhat related question, but I will post it in a separate thread. Regards, Yun
-
Re: Is Put() operation a synchronous call on server side?
Jimmy Xiang 2012-12-06, 18:45
For a single put, yes, there is some overhead. There is a jira to remove the overhead: HBASE-6739, which is still open.
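To picture where that overhead comes from, the sketch below runs the same task once through an ExecutorService and once directly on the calling thread. It is an illustration only, not HBase code and not a description of what HBASE-6739 proposes; the class and method names are hypothetical. The pooled version hands the work to another thread and waits for it, while the inline version avoids that handoff.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SinglePutOverheadSketch {
    // The "work" just reports which thread executed it.
    static Callable<String> task() {
        return () -> Thread.currentThread().getName();
    }

    // Hand the task to a pool thread and block until it completes.
    static String runViaPool(ExecutorService pool) {
        try {
            return pool.submit(task()).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Run the task directly on the caller's thread, with no handoff.
    static String runInline() {
        try {
            return task().call();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            System.out.println("via pool: " + runViaPool(pool));
            System.out.println("inline:   " + runInline());
        } finally {
            pool.shutdown();
        }
    }
}
```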
Thanks, Jimmy