Himanshu Vashishtha  2012-04-12, 19:11
Jean-Daniel Cryans  2012-04-12, 21:31
lars hofhansl  2012-04-12, 21:50
Jesse Yates  2012-04-12, 21:56
lars hofhansl  2012-04-12, 22:12
lars hofhansl  2012-04-12, 22:37
Himanshu Vashishtha  2012-04-12, 23:17
Andrew Purtell  2012-04-13, 00:50
Lars George  2012-04-13, 06:13
lars hofhansl  2012-04-13, 23:54
Himanshu Vashishtha  2012-04-14, 00:18
lars hofhansl  2012-04-14, 00:31
Lars George  2012-04-14, 10:38
lars hofhansl  2012-04-15, 21:01
-
HBase Replication use cases Himanshu Vashishtha 2012-04-12, 19:11
Hello All,

I have been testing HBase replication (0.90.4 and 0.92 variants).

Here are some of the findings:

a) 0.90+ does not handle znode changes well. During an ongoing replication, if I delete a peer and a region server goes to the znode to update the log status, the region server aborts itself when it sees the missing znode.

Recoverable Zookeeper seems to have fixed this in 0.92+?

0.92 has a lot of new features (start/stop handle, master-master, cyclic).

But there are corner cases with the start/stop switches.

i) A log is enqueued only when the replication state is set to true. When we start the cluster, the state is true and the starting region server takes the new log into the queue. If I do a stop_replication, a log roll happens, and then I do a start_replication, the current log will not be replicated, because it missed the opportunity of being added to the queue.

ii) If I _start_ a region server while the replication state is set to false, its log will not be added to the queue. If I then do a start_replication, its log will not be replicated.

iii) Removing a peer doesn't cause the master cluster's region server to abort, but if zk is down and there is a log roll, it will abort. Not a serious one, as zk is down so the cluster is not healthy anyway.

I was looking for jiras (including 2611), and stumbled upon 2223. I don't think there is anything like the time-based partition behavior mentioned in the jira description. That said, the patch has a lot of other nice things that are indeed in the existing code. Please correct me if I have missed anything.

Having said that, I wonder how other folks out there use it: their experience and the common issues (minor and major) they come across. I did find a ppt by Jean Daniel at oscon that mentions using it in SU production.

I plan to file jiras for the above ones and will start digging in.

Looking forward to your responses.

Thanks,
Himanshu
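To make corner cases i) and ii) concrete, here is a minimal toy model of the behavior as described in this message. It is not HBase code: the `Cluster` and `RegionServer` classes, the log names, and the rule "a log is enqueued only at the moment it is opened" are illustrative assumptions taken from the description above.

```python
class Cluster:
    """Toy stand-in for the cluster-wide replication switch (hypothetical)."""

    def __init__(self):
        self.replicating = True  # the state is true when the cluster starts


class RegionServer:
    """Toy region server: a log is enqueued only at the moment it is opened."""

    def __init__(self, cluster, name):
        self.cluster = cluster
        self.name = name
        self.seq = 0
        self.queue = []          # logs that will be replicated
        self.current_log = None
        self.roll_log()          # a starting server opens its first log

    def roll_log(self):
        self.seq += 1
        self.current_log = f"{self.name}.log.{self.seq}"
        if self.cluster.replicating:
            self.queue.append(self.current_log)


# Case i): stop_replication, then a log roll, then start_replication.
cluster = Cluster()
rs1 = RegionServer(cluster, "rs1")   # rs1.log.1 is enqueued
cluster.replicating = False          # stop_replication
rs1.roll_log()                       # rs1.log.2 is opened but not enqueued
cluster.replicating = True           # start_replication
case_i_missed = rs1.current_log not in rs1.queue

# Case ii): a region server started while the state is false.
cluster.replicating = False
rs2 = RegionServer(cluster, "rs2")   # rs2.log.1 is never enqueued
cluster.replicating = True           # start_replication
case_ii_missed = rs2.current_log not in rs2.queue

print(case_i_missed, case_ii_missed)
```

In both cases the log that is current when start_replication runs never enters the queue, so edits written to it are not shipped until the next roll opens a new log.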
-
Re: HBase Replication use cases Jean-Daniel Cryans 2012-04-12, 21:31
On Thu, Apr 12, 2012 at 12:11 PM, Himanshu Vashishtha
<[EMAIL PROTECTED]> wrote:
> Hello All,
>
> I have been doing testing on the HBase replication (0.90.4, and 0.92 variants).
>
> Here are some of the findings:
>
> a) 0.90+ is not that great in handling out znode changes; in an
> ongoing replication, if I delete a peer and a region server goes to
> the znode to update the log status, the region server aborts itself
> when it sees a missing znode.

I don't think I've encountered that, and I've deleted peers a few times. Log?

>
> Recoverable Zookeeper seems to have fix this in 0.92+?
>
> 0.92 has lot of new features (start/stop handle, master master, cyclic).
>
> But there are corner cases with the start/stop switches.
> i) A log is en-queue when the replication state is set to true. When we
> start the cluster, it is true and the starting region server takes the
> new log into the queue. If I do a stop_replication, and there is a log
> roll, and then I do a start_replication, the current log will not be
> replicated, as it has missed the opportunity of being added to the queue.

stop_replication is a kill switch; it should only be used in that context, e.g. when you want to kill replication. It's not supposed to do something reliable outside of killing replication.

>
> ii) If I _start_ a region server when the replication state is set to
> false, its log will not be added to the queue. Now, if I do a
> start_replication, its log will not be replicated.

See above.

>
> iii) Removing a peer doesn't result in master region server abort, but
> in case of zk is down and there is a log roll, it will abort. Not a
> serious one as zk is down so the cluster is not healthy anyway.

If zk is down, HBase should be down.

>
> I was looking for jiras (including 2611), and stumbled upon 2223. I
> don't think there is any thing like time based partition behavior (as
> mentioned in the jira description). Though. the patch has lot of other
> nice things which indeed are in existing code. Please correct me if I
> miss anything.

The jira was about being able to handle a network partition that lasted more than 10 minutes. Originally the replication code was buffering in memory, so eventually it was OOMEing.

>
> Having said that, I wonder about other folks out there use it.
> Their experience, common issues (minor + major) they come across.
> I did find a ppt by Jean Daniel at oscon mentioning about using it in
> SU production.

In 0.90, a ZK hiccup could kill a bunch of region servers when they wanted to talk to ZK. In 0.92 RecoverableZookeeper handles it.

The master is very sloppy when keeping track of the logs that can be deleted. It was built like that so that it wouldn't hit ZK too hard, but it seems to be retaining too many logs.

J-D
-
Re: HBase Replication use cases lars hofhansl 2012-04-12, 21:50
Thanks Himanshu,

we're planning to use Replication for cross-DC replication for DR (and we added a bunch of stuff and fixed bugs in replication).

We'll have it always on (and only use stop/start_peer, which is new in 0.94+, to temporarily stop replication, rather than stop/start_replication).

HBASE-2611 is a problem. We have not had time recently to work on it.

i) and ii) can be worked around by forcing a log roll on all region servers after replication is enabled. Replication would be considered started after the logs were rolled... But that is quite annoying.

Is iii) still a problem in 0.92+? I thought we fixed that together with a).

-- Lars
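The workaround for i) and ii) described above (force a log roll on every region server once replication is enabled, and only then treat replication as started) can be sketched with the same kind of toy model. Everything here is hypothetical: `roll_log`, `start_replication_with_roll`, and the dictionaries standing in for region servers are illustrations, not HBase APIs.

```python
def roll_log(server, replicating):
    """Open a new log; enqueue it only if replication is on (toy rule)."""
    server["seq"] += 1
    server["current"] = f'{server["name"]}.log.{server["seq"]}'
    if replicating:
        server["queue"].append(server["current"])


def start_replication_with_roll(servers):
    """Hypothetical helper: enable replication, then force a roll everywhere."""
    for server in servers:
        roll_log(server, replicating=True)
    return True  # replication counts as started only after the last roll


# Two servers whose current logs were opened while replication was off.
servers = [
    {"name": "rs1", "seq": 2, "current": "rs1.log.2", "queue": ["rs1.log.1"]},
    {"name": "rs2", "seq": 1, "current": "rs2.log.1", "queue": []},
]
skipped = [s["current"] for s in servers]   # these logs stay unreplicated
start_replication_with_roll(servers)
covered = all(s["current"] in s["queue"] for s in servers)

print(skipped, covered)
```

The logs that were open while replication was off stay out of the queue; the forced roll only guarantees that everything written afterwards lands in a tracked log, which is why replication can be considered started only once the last roll has finished.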
-
Re: HBase Replication use cases Jesse Yates 2012-04-12, 21:56
On Apr 12, 2012, at 2:50 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> i) and ii) can be worked around by forcing a log roll on all region servers after replication was enabled. Replication would be considered started after the logs were
> rolled... But that is quite annoying.
>

Should we consider adding this as part of the replication code proper? Is there a smarter way to go about it?

- Jesse
-
Re: HBase Replication use cases lars hofhansl 2012-04-12, 22:12
I think it's like J-D said: stop_replication is a kill switch.
In 0.94+ we have start/stop_peer, which suspends replication but still keeps track of the logs to replicate.

It would complicate the code a lot (IMHO) to start replicating from partial logs, or to roll each and every log and then consider replication started only after the last log was rolled.
-
Re: HBase Replication use cases lars hofhansl 2012-04-12, 22:37
Himanshu,

please keep digging, though. This will be mission critical for us, and we'll be testing it heavily.
If you find anything strange, by all means file a jira; squashing bugs here is critical.

-- Lars
-
Re: HBase Replication use cases Himanshu Vashishtha 2012-04-12, 23:17
Thank you all for replying.

@JD: you asked about logs for 0.90. I ran it two weeks back and don't have the logs atm; but you also echoed the same thing: when region servers talk to ZK and there is a problem, they abort themselves. It seems similar to me.

@Lars/@Jesse: Yes, rolling the log on invoking start/stop replication is fairly disruptive. I agree that enabling/disabling a particular peer is more appropriate, as we keep enqueuing the new logs at the ReplicationSource. But atm there is no limit on the number of logs it keeps (a PriorityBlockingQueue has Integer.MAX_VALUE capacity).

For iii), on a log roll, ReplicationSourceManager tries to add the new log to the znodes of the peers, and throws an IOException when it fails. If ZK is down, HBase is automatically down (though the RS keeps on waiting, for the Master, which aborts itself, and for the ZK quorum); but it can still serve reads/writes to existing clients, with no splits obviously. Not a serious issue, though.

Yeah, start/stop_replication begets interesting scenarios, which may lead to incomplete replication. It should be used only in extreme conditions.

Still looking at it...

Thanks,
Himanshu
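The point about the unbounded log queue can be illustrated with a small sketch. It uses Python's `queue.PriorityQueue` as a stand-in for the Java queue mentioned above; the function name `backlog_after` and the assumption that an enabled peer ships each log immediately are simplifications made up for this example.

```python
import queue


def backlog_after(rolls, peer_enabled):
    """Return how many logs are still queued after `rolls` log rolls."""
    pending = queue.PriorityQueue()   # maxsize=0: no upper bound on entries
    for seq in range(1, rolls + 1):
        pending.put(seq)              # each log roll enqueues the new log
        if peer_enabled:
            pending.get()             # toy assumption: shipped right away
    return pending.qsize()


print(backlog_after(1000, peer_enabled=True))
print(backlog_after(1000, peer_enabled=False))
```

With the peer disabled, every roll adds an entry and nothing drains the queue, so the backlog (and the set of logs that must be retained) grows with the number of rolls for as long as the peer stays disabled.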
-
Re: HBase Replication use cases Andrew Purtell 2012-04-13, 00:50
I have a new use case that will involve intercontinental replication between two EC2 regions, using 0.94. It will only be a proof of concept, but it might shake something out. I will also have an economic incentive to minimize transfer.

Best regards,

- Andy
-
Re: HBase Replication use cases Lars George 2012-04-13, 06:13
Hi Lars,

I am really curious how you will handle the possible (or, say, likely) inconsistencies between regions of the same table in a DR situation. This seems to be solely application-layer logic, but on the other hand a lot of people will need something here. So the question is, could this be added to the code?

The idea is: could we hint to the replication what schema we are using, so that it can handle shipping the logs somewhat "transactionally" on the receiving end? For example, it could record sequence IDs or even timestamps, and when the originating cluster fails there is a mechanism on the receiving end that deletes all inconsistent changes, bringing it back to a well-known checkpoint. The replication does ship the WAL edits, so this might be all that is needed, plus some ZooKeeper magic to synchronize the checkpoint across the region servers?

Maybe I am seeing this wrong, but how else would you recover in a DR situation?

Cheers,
Lars
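One way to read the checkpoint idea above, sketched as a toy example rather than a proposal for actual HBase code: the receiving side remembers the newest timestamp it has applied from each source region server, takes the minimum as the last point every source is known to have reached, and discards anything newer. The names `Edit`, `consistent_checkpoint`, and `roll_back` are hypothetical, and the sketch assumes each source ships its edits in timestamp order and that timestamps are comparable across servers.

```python
from collections import namedtuple

# Hypothetical record of an edit applied on the receiving cluster.
Edit = namedtuple("Edit", ["source", "ts", "row"])


def consistent_checkpoint(last_applied_ts):
    """Newest timestamp that every source is known to have reached."""
    return min(last_applied_ts.values())


def roll_back(edits, checkpoint):
    """Keep only the edits at or before the checkpoint."""
    return [e for e in edits if e.ts <= checkpoint]


applied = [
    Edit("rs1", 97, "row-a"),
    Edit("rs1", 105, "row-b"),
    Edit("rs2", 98, "row-c"),
    Edit("rs3", 90, "row-d"),
    Edit("rs3", 110, "row-e"),
]

# Track the newest applied timestamp per source region server.
last_applied_ts = {}
for e in applied:
    last_applied_ts[e.source] = max(last_applied_ts.get(e.source, 0), e.ts)

checkpoint = consistent_checkpoint(last_applied_ts)
kept = roll_back(applied, checkpoint)

print(checkpoint, [e.row for e in kept])
```

A source that is idle or lagging holds the checkpoint back for everyone, so a real mechanism would need some way to advance it without new edits; that part is left open here, as it is in the message above.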
-
Re: HBase Replication use cases lars hofhansl 2012-04-13, 23:54
Hey Lars,

in a DR scenario (i.e. a DC falls into the ocean) we have SLAs that allow for a certain amount of data loss.

The main concern here would be that "rows" could be in a state that does not correspond to the state at the end of any of the row transactions in the source system, right? Or are you referring to even cross-table consistency?

-- Lars
-
Re: HBase Replication use cases - Himanshu Vashishtha 2012-04-14, 00:18
@Lars H:
WALEdits are per transaction (i.e. per row), and we do ship WALEdits, so the state at the slave will always be the end of some transaction the master has seen, right?

The idea is that the unit of atomicity is a WALEdit. Can there be a scenario where the end state does not correspond to the state at the end of any of the row transactions on the master? It would be good to know.

Thanks.
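The atomicity argument can be sketched in a few lines of Python. This is a toy model, not HBase code: `WalEdit`, `apply_edit`, `row_history`, and `replicate` are hypothetical names introduced only for illustration. The model assumes that each edit covers exactly one row, that edits are applied all-or-nothing, and that the slave applies them in source order. Under those assumptions, whatever prefix of the log reaches the slave, every row is in a state that the master also passed through.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalEdit:
    """Hypothetical stand-in for one WAL entry: all cell changes of one row transaction."""
    row: str
    cells: dict  # column -> value


def apply_edit(table, edit):
    """Apply one edit atomically: all of its cells land together."""
    new_row = dict(table.get(edit.row, {}))
    new_row.update(edit.cells)
    table[edit.row] = new_row


def row_history(edits, row):
    """Every state the given row passes through on the source, in order."""
    table, states = {}, [{}]
    for edit in edits:
        apply_edit(table, edit)
        if edit.row == row:
            states.append(dict(table[row]))
    return states


def replicate(edits, shipped):
    """Slave state after only the first `shipped` edits have arrived."""
    slave = {}
    for edit in edits[:shipped]:
        apply_edit(slave, edit)
    return slave


# Assumed example log: two transactions on r1, one on r2.
log = [
    WalEdit("r1", {"a": 1, "b": 1}),
    WalEdit("r2", {"a": 1}),
    WalEdit("r1", {"a": 2, "b": 2}),
]


def check_all_prefixes():
    """For every shipped prefix, each slave row equals some source-side row state."""
    for shipped in range(len(log) + 1):
        slave = replicate(log, shipped)
        for row in ("r1", "r2"):
            assert slave.get(row, {}) in row_history(log, row)
    return True
```

Note that the check is per row only. It says nothing about whether `r1` and `r2` are mutually consistent on the slave, which is a separate question.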
-
Re: HBase Replication use cases - lars hofhansl 2012-04-14, 00:31
Ah yes. Good point. You're absolutely right.
-
Re: HBase Replication use cases - Lars George 2012-04-14, 10:38
Hi,
I was after table consistency. As soon as you break an entity group into more than one row, you might have the problem that the rows span two regions. Now assume row 1 in region 1 is updated, but row 2 in region 2 is not, because replication lagged and the originating cluster is now a goner. I was asking how people handle this.

Maybe your custom region split rules can help here to enforce that the rows of an entity group (I am using the Megastore notation here) are all on one server and therefore *should* be updated together. This would also require your cross-row transactions to use single WAL entries for multiple rows - that is what this does, right?

But on a larger scale, I was hoping that we could add recovery points to an entire table, so that when replication stops and you are promoting the slave to be master, you can delete all partial updates (sure, they are entire row updates, so how do you roll that back?) to ensure you have consistency. It is less about losing data and more about keeping the table coherent.

Is this not a concern for you guys? Do you have a schema that does not have this issue? For example, are you forcing all entity groups to be a single row?

Just curious.

Lars
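The lag scenario and the "recovery point" idea can be made concrete with a small simulation. This is a sketch under stated assumptions, not an existing HBase feature: `Edit`, `shipped_edits`, `materialize`, `consistent_cut`, and `rollback_to_cut` are hypothetical names, the row keys are made up, every edit is assumed to carry a source timestamp, and each region's edits are assumed to ship in order. The cut is simply the newest timestamp that all regions have shipped; everything newer is discarded on the slave.

```python
from collections import namedtuple

# Hypothetical edit record: source timestamp, owning region, row key, new value.
Edit = namedtuple("Edit", "ts region row value")

# One entity group stored as two rows that ended up in different regions.
source_log = [
    Edit(1, "region1", "order#1", "v1"),
    Edit(1, "region2", "order#1/items", "v1"),
    Edit(2, "region1", "order#1", "v2"),
    Edit(2, "region2", "order#1/items", "v2"),
]


def shipped_edits(log, shipped_upto):
    """Edits that reached the slave; shipped_upto maps region -> last shipped ts."""
    return [e for e in log if e.ts <= shipped_upto[e.region]]


def materialize(edits):
    """Latest value per row, applying edits in timestamp order."""
    state = {}
    for e in sorted(edits, key=lambda e: e.ts):
        state[e.row] = e.value
    return state


def consistent_cut(shipped_upto):
    """Newest timestamp that every region has fully shipped."""
    return min(shipped_upto.values())


def rollback_to_cut(edits, cut):
    """Drop every edit newer than the cut."""
    return [e for e in edits if e.ts <= cut]


# region2 lagged: its second edit never left the source cluster.
progress = {"region1": 2, "region2": 1}
arrived = shipped_edits(source_log, progress)

torn = materialize(arrived)        # entity group is half at v2, half at v1
cut = consistent_cut(progress)
coherent = materialize(rollback_to_cut(arrived, cut))  # both rows back at v1
```

The rollback loses the `v2` write to `order#1`, which matches the trade-off described above: some data is given up in exchange for a coherent table. How the per-region progress would be recorded and agreed upon is left open here.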
-
Re: HBase Replication use cases - lars hofhansl 2012-04-15, 21:01
Hey Lars,
this has not come up as a use case for us yet. Since HBase does not provide cross-table or cross-row consistency anyway (though 0.94 will have HBASE-5229), the application cannot expect such consistency in the first place; at best we could make this timestamp-consistent.

-- Lars
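One possible reading of "timestamp-consistent" is that the slave keeps all versions and readers only look at data up to a timestamp that every source region server is known to have shipped. The sketch below is a toy model of that reading, not HBase code: `VersionedTable`, `get_as_of`, `safe_timestamp`, and `snapshot` are hypothetical names, and the model assumes that timestamps from different region servers are comparable, which clock skew would weaken in practice.

```python
class VersionedTable:
    """Toy table that keeps every (timestamp, value) version per row."""

    def __init__(self):
        self._versions = {}  # row -> list of (ts, value)

    def put(self, row, ts, value):
        self._versions.setdefault(row, []).append((ts, value))

    def get_as_of(self, row, ts):
        """Newest value written at or before ts, or None if there is none."""
        candidates = [(t, v) for t, v in self._versions.get(row, []) if t <= ts]
        if not candidates:
            return None
        return max(candidates, key=lambda tv: tv[0])[1]


def safe_timestamp(last_applied):
    """Newest timestamp up to which every source region server has shipped."""
    return min(last_applied.values())


def snapshot(table, rows, ts):
    """Read all rows as of the same timestamp."""
    return {row: table.get_as_of(row, ts) for row in rows}


# Slave contents after the source cluster died: row2's update at ts=20 never arrived.
slave = VersionedTable()
slave.put("row1", 10, "a1")
slave.put("row1", 20, "a2")
slave.put("row2", 10, "b1")

# Assumed bookkeeping: last timestamp applied from each source region server.
last_applied = {"rs1": 20, "rs2": 10}

latest_view = snapshot(slave, ["row1", "row2"], ts=float("inf"))
safe_view = snapshot(slave, ["row1", "row2"], ts=safe_timestamp(last_applied))
```

Unlike deleting the newer edits, this keeps them on the slave and hides them at read time, so the choice of cut can be revisited later.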