-
Atomicity questions
Mohit Anchlia 2011-12-01, 22:22
I have some questions about ACID after reading this page:
http://hbase.apache.org/acid-semantics.html

- Atomicity point 5: a row must be either "a=1,b=1,c=1" or "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".

How is this handled internally in HBase so that the above holds?
-
Re: Atomicity questions
lars hofhansl 2011-12-01, 22:55
Hi Mohit,

the best way to study this is to look at MultiVersionConsistencyControl.java (since you are asking how this is handled internally).

In a nutshell, this ensures that read operations don't see writes that are not completed, by (1) defining a thread read point that is rolled forward only after a completed operation and (2) assigning a special timestamp (not the timestamp that you set from the client API) to all KeyValues.

-- Lars
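
The two ideas in this reply (a read point, plus an internal write number on every KeyValue) can be illustrated with a toy model. This is only a sketch and not HBase code: the class `ReadPointSketch`, its `Cell` type, and the methods `apply`, `advanceReadPoint`, and `readRow` are all hypothetical names invented for illustration, and the model is single-threaded with one row. It shows only the filtering rule: a reader ignores any cell whose write number is above the read point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of read-point filtering for a single row. Illustrative only, not HBase code. */
final class ReadPointSketch {

    /** One column value tagged with an internal write number (hypothetical type). */
    private static final class Cell {
        final String column;
        final int value;
        final long writeNumber;

        Cell(String column, int value, long writeNumber) {
            this.column = column;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final List<Cell> memstore = new ArrayList<>();
    private long readPoint = 0; // highest write number that readers may see

    /** Stores one column of a write. The cell stays invisible until the read point reaches it. */
    void apply(String column, int value, long writeNumber) {
        memstore.add(new Cell(column, value, writeNumber));
    }

    /** Makes every write up to and including writeNumber visible to readers. */
    void advanceReadPoint(long writeNumber) {
        readPoint = Math.max(readPoint, writeNumber);
    }

    /** Returns the newest visible value per column, skipping cells above the read point. */
    Map<String, Integer> readRow() {
        Map<String, Integer> row = new TreeMap<>();
        Map<String, Long> newest = new TreeMap<>();
        for (Cell c : memstore) {
            if (c.writeNumber > readPoint) {
                continue; // belongs to a write that has not completed yet
            }
            Long seen = newest.get(c.column);
            if (seen == null || c.writeNumber >= seen) {
                newest.put(c.column, c.writeNumber);
                row.put(c.column, c.value);
            }
        }
        return row;
    }
}
```

In this model, if write 1 (a=1,b=1,c=1) has been published and write 2 has so far applied only a=2 and b=2, `readRow()` still returns a=1,b=1,c=1. The mixed row from the question never becomes visible, because the read point moves to 2 only after all three cells of write 2 are in place.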
-
Re: Atomicity questions
Mohit Anchlia 2011-12-01, 23:03
Thanks. I'll try to take a look, but I haven't worked with ZooKeeper before. Does HBase use ZooKeeper for any of its ACID functionality?
-
Re: Atomicity questions
Stack 2011-12-01, 23:13
On Thu, Dec 1, 2011 at 3:03 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> Thanks. I'll try and take a look, but I haven't worked with zookeeper
> before. Does it use zookeeper for any of ACID functionality?

No.

St.Ack
-
Re: Atomicity questions
lars hofhansl 2011-12-01, 23:37
Nope, not using ZK, that would not scale down to the cell level.
You'll probably have to stare at the code in MultiVersionConsistencyControl for a while (I know I had to).

The basic flow of a write operation is this:
1. lock the row
2. persist the change to the write-ahead log
3. get a "writenumber" from mvcc (this is basically a timestamp)
4. apply the change to the memstore (using that write number)
5. advance the readpoint (the maximum timestamp of changes that reads will see); this is the point where readers see the change
6. unlock the row

(7. when the memstore is full, flush it to a new disk file; this is done asynchronously and is not really important here, although it has some complicated implications when the flush happens while there are readers reading from an old read point)

The above is sometimes relaxed for idempotent operations.

-- Lars
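
Steps 1 to 6 of the write flow described in this message can be sketched roughly as follows. This is a simplified illustration and not how HBase implements it: `WriteFlowSketch`, `put`, `get`, and the nested `Cell` type are hypothetical names, the write-ahead log is an in-memory list standing in for a durable log, and there is only one row, so the read point can be set directly to the write number that just finished. With many rows being written concurrently, advancing the read point would have to wait for earlier write numbers to complete, which this sketch does not model. Step 7 (the flush) is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Toy single-row model of the six-step write flow. All names are illustrative, not HBase's API. */
final class WriteFlowSketch {

    /** One column value tagged with the write number that produced it (hypothetical type). */
    private static final class Cell {
        final String column;
        final int value;
        final long writeNumber;

        Cell(String column, int value, long writeNumber) {
            this.column = column;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<String> writeAheadLog = new ArrayList<>(); // stand-in for a durable log
    private final List<Cell> memstore = new CopyOnWriteArrayList<>();
    private final AtomicLong writeNumbers = new AtomicLong();
    private volatile long readPoint = 0;

    /** Writes all given columns of the row as one atomic unit. */
    void put(Map<String, Integer> columns) {
        rowLock.lock();                                        // 1. lock the row
        try {
            writeAheadLog.add(columns.toString());             // 2. persist to the write-ahead log
            long writeNumber = writeNumbers.incrementAndGet(); // 3. get a write number
            for (Map.Entry<String, Integer> e : columns.entrySet()) {
                // 4. apply the change to the memstore, tagged with that write number
                memstore.add(new Cell(e.getKey(), e.getValue(), writeNumber));
            }
            readPoint = writeNumber;                           // 5. advance the read point
        } finally {
            rowLock.unlock();                                  // 6. unlock the row
        }
    }

    /** Reads the row as of the read point captured when the read starts. */
    Map<String, Integer> get() {
        long snapshot = readPoint; // fixed for the duration of this read
        Map<String, Integer> row = new TreeMap<>();
        for (Cell c : memstore) {
            // Cells are appended in write-number order, so a later visible cell overwrites an earlier one.
            if (c.writeNumber <= snapshot) {
                row.put(c.column, c.value);
            }
        }
        return row;
    }
}
```

The point of the ordering is that a reader that starts between steps 4 and 5 may already find the new cells in the memstore, but its captured read point is still the old one, so it filters them out and sees the previous version of the whole row.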
-
Re: Atomicity questions
Mohit Anchlia 2011-12-01, 23:57
Thanks, that makes it clearer. I also looked at the MVCC code as you pointed out.
So I am wondering where ZK is used specifically.
-
Re: Atomicity questions
lars hofhansl 2011-12-02, 00:23
ZK is mostly for orchestrating between the master and regionservers.