-
zookeeper cluster spanning datacenters
Damu R 2011-09-22, 14:50
Hi, I would like to know the downsides of having a ZooKeeper cluster that spans multiple datacenters. The requirement is that a datacenter failure should not bring down the ZooKeeper cluster. From my understanding, a hot/cold cluster setup is not possible, so we are thinking of putting ZK servers in 3 colos (1+1+1 or 2+2+3). One major drawback I can think of is that latency will affect the throughput of the system. The system does not require high throughput and can accept some latency. How much effect will the latency have on the throughput of the system? What are the other downsides of spreading the cluster across datacenters?
Regards,
Damu
-
RE: zookeeper cluster spanning datacenters
Fournier, Camille F. 2011-09-22, 15:03
We spread our ZKs across 3 data centers and in fact, these data centers are split across global regions (2 or 4 in one region, one in a remote region). To keep throughput up (and note that the throughput you have to worry about is only write throughput), we always ensure that the master is in one of the "local" data centers.

If you have a very write-heavy load that is sensitive to write latency, this might affect your performance. It won't affect reads at all because reads are serviced from the memory of the ZK server you connect to. For a mostly read-intensive load, splitting across data centers is unlikely to cause you problems.

There is one exception: monitoring. Even across data centers in the same region, we sometimes see the zk dashboard unable to properly monitor the leader of a heavily utilized cluster. This is due to the way the four-letter-word (4lw) connections are managed, and it is something I'm trying to fix.

If you have the machines to test, I would recommend running zk-smoketest (https://github.com/phunt/zk-smoketest) on the proposed config.

C
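The point about keeping the leader in a "local" data center can be illustrated with a rough model. This is only a sketch under stated assumptions, not a measurement: it assumes a write commits once the leader has acks from a strict majority (counting itself), ignores disk and processing time, and uses made-up round-trip times. The function name `commit_latency_ms` and the RTT figures are hypothetical.

```python
def commit_latency_ms(ensemble_size, follower_rtts_ms):
    """Rough write-commit latency as seen by the leader.

    Assumes the leader counts its own vote and needs acks from enough
    followers to reach a strict majority, so the latency is set by the
    slowest follower within the fastest such group.
    """
    quorum = ensemble_size // 2 + 1
    acks_needed = quorum - 1  # the leader's own vote needs no network hop
    if acks_needed == 0:
        return 0.0
    fastest_first = sorted(follower_rtts_ms)
    return float(fastest_first[acks_needed - 1])


# Hypothetical 5-server ensemble: 2+2 in one region, 1 in a remote region.
# RTTs (ms) from the leader to each of the 4 followers; values are invented.
LOCAL_LEADER_RTTS = [0.5, 2.0, 2.0, 80.0]     # leader sits in a local DC
REMOTE_LEADER_RTTS = [80.0, 80.0, 80.0, 80.0]  # leader sits in the remote DC

local_ms = commit_latency_ms(5, LOCAL_LEADER_RTTS)
remote_ms = commit_latency_ms(5, REMOTE_LEADER_RTTS)
print(f"local leader: ~{local_ms} ms, remote leader: ~{remote_ms} ms")
```

Under this model the remote server's latency never enters the commit path while the leader is local, which is consistent with the advice above; running zk-smoketest on the real machines is still the way to get actual numbers.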
-
Re: zookeeper cluster spanning datacenters
Ted Dunning 2011-09-22, 15:27
One additional architecture that has been proposed for people with only 2 data centers is to put 2+2 machines in the data centers and then put a tie breaker in EC2.
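The layouts mentioned so far (1+1+1, 2+2+3, and 2+2 plus a tie breaker) can be checked with simple majority arithmetic. The sketch below assumes plain majority quorums and treats the tie breaker as a third site holding one voter; the helper names are made up for illustration.

```python
def has_majority(alive, total):
    """True if `alive` voters form a strict majority of `total`."""
    return 2 * alive > total


def survives_any_dc_loss(layout):
    """layout: number of voting servers per data center.

    Returns True if losing any single data center still leaves
    a strict majority of the ensemble running.
    """
    total = sum(layout)
    return all(has_majority(total - dc, total) for dc in layout)


layouts = {
    "1+1+1": [1, 1, 1],
    "2+2+3": [2, 2, 3],
    "2+2 + tie breaker": [2, 2, 1],
    "2+2 (no tie breaker)": [2, 2],
}
results = {name: survives_any_dc_loss(dcs) for name, dcs in layouts.items()}
for name, ok in results.items():
    print(f"{name}: {'survives' if ok else 'does NOT survive'} a single DC failure")
```

The plain 2+2 case fails because losing either site leaves 2 of 4 voters, which is not a strict majority; the single extra voter at a third site is what fixes that.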
-
Re: zookeeper cluster spanning datacenters
kishore g 2011-09-22, 16:13
This is an interesting topic. Is there a place where we can find the various possible setups, the pros and cons of each, and which kinds of use cases work or do not work with them?

thanks,
Kishore G
-
Re: zookeeper cluster spanning datacenters
Ted Dunning 2011-09-22, 16:15
If the wiki doesn't have enough details for you, put questions on pages that need more details or start new pages with an outline of what you think would help.
-
Re: zookeeper cluster spanning datacenters
Damu R 2011-09-22, 16:46
Hi Fournier,

> We spread our ZKs across 3 data centers and in fact, these data centers are
> split across global regions (2 or 4 in one region, one in a remote region).
> To keep throughput up (and note that the throughput you have to worry about
> is only write throughput), we always ensure that the master is in one of the
> "local" data centers.

How can we make sure the master (leader?) is in the local datacenter? Is there any way we can control the leader election?

> If you have the machines to test, I would recommend running zk-smoketest
> (https://github.com/phunt/zk-smoketest) on the proposed config.

This tool will be very useful for evaluating the setup.

Thanks
Damu
-
RE: zookeeper cluster spanning datacenters
Fournier, Camille F. 2011-09-22, 17:26
We have a monitor process that runs 'stat' against the remote ZK and, if it reports that server as the leader, kills that ZK server process.
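A monitor like the one described could look roughly like the following. This is a sketch, not the actual process used above: the function names are hypothetical, the sample output is illustrative rather than captured from a real server, and the restart step is deliberately left out because it depends on how the server is supervised. It relies on the `stat` four-letter word, whose output includes a `Mode:` line (`leader`, `follower`, or `standalone`); newer ZooKeeper releases may require that command to be enabled in the server configuration.

```python
import socket


def four_letter_word(host, port, command="stat", timeout=5.0):
    """Send a four-letter-word command to a ZooKeeper server and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output):
    """Return the value of the 'Mode:' line, or None if it is missing."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def is_leader(stat_output):
    return parse_mode(stat_output) == "leader"


# Illustrative reply only; a real reply has more fields and different values.
SAMPLE_STAT = (
    "Latency min/avg/max: 0/0/0\n"
    "Received: 10\n"
    "Sent: 9\n"
    "Outstanding: 0\n"
    "Mode: leader\n"
    "Node count: 4\n"
)

if is_leader(SAMPLE_STAT):
    # In a real monitor this is where the remote server would be restarted
    # so that the ensemble elects a new leader.
    print("remote server is the leader")
```

In practice the monitor would call `four_letter_word(remote_host, 2181)` on a schedule (2181 being the usual client port, and `remote_host` a placeholder) and pass the reply to `is_leader`. Note that killing the leader triggers a new election, during which writes are briefly unavailable, and the new leader is not guaranteed to land where you want it on the first try.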
-
Re: zookeeper cluster spanning datacenters
Ted Dunning 2011-09-22, 19:01
Wow. Brutal, but effective.
-
Re: zookeeper cluster spanning datacenters
Vishal Kher 2011-09-22, 20:45
Hi Camille,

This is very interesting.

Can you give more info on your setup?
- What network connectivity (bandwidth and latency) do you have between the data centers? How much of the bandwidth is available for ZK?
- What timeout values (server and client session timeouts) do you use? How much latency are the applications willing to tolerate?

We are thinking of running ZK across data centers as well, and it would be great to see how others are resolving some of these problems.

Thanks.
-Vishal
-
Re: zookeeper cluster spanning datacenters
Mahadev Konar 2011-09-22, 20:53
Better still, put it up on the wiki at https://cwiki.apache.org/confluence/display/ZOOKEEPER/Index

thanks
mahadev
-
Re: zookeeper cluster spanning datacenters
Flavio Junqueira 2011-09-23, 09:12
One quick comment. We do not require majority quorums in ZooKeeper, and one reason we implemented this feature was exactly to enable more flexibility in deployments with multiple data centers. Flexible quorums are not supposed to give you the ability to always have all voting replicas in a single data center, but depending on the number of data centers you're using, they could give you fewer cross-DC messages per transaction.

I was actually wondering whether, with the new reconfiguration feature coming up, we will be able to change the weights of servers in an online fashion.

-Flavio
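To make the flexible-quorum remark concrete, here is a minimal sketch of a weighted, grouped quorum check. It assumes the rule used by ZooKeeper's hierarchical quorums as I understand it: a set of servers forms a quorum when it holds more than half the weight in more than half of the groups that have non-zero weight. The function name, server ids, and the 3x3 layout are all hypothetical.

```python
def contains_quorum(acks, groups, weights):
    """Check a grouped, weighted quorum.

    acks:    set of server ids that acknowledged
    groups:  dict of group id -> list of server ids (e.g. one group per DC)
    weights: dict of server id -> integer weight
    """
    groups_won = 0
    active_groups = 0
    for members in groups.values():
        group_weight = sum(weights[s] for s in members)
        if group_weight == 0:
            continue  # zero-weight groups do not take part in voting
        active_groups += 1
        acked_weight = sum(weights[s] for s in members if s in acks)
        if 2 * acked_weight > group_weight:
            groups_won += 1
    return 2 * groups_won > active_groups


# Hypothetical layout: 3 data centers with 3 equally weighted servers each.
GROUPS = {"dc1": [1, 2, 3], "dc2": [4, 5, 6], "dc3": [7, 8, 9]}
WEIGHTS = {server: 1 for server in range(1, 10)}

two_dcs = contains_quorum({1, 2, 4, 5}, GROUPS, WEIGHTS)   # 2 of 3 in two DCs
lopsided = contains_quorum({1, 2, 3, 4}, GROUPS, WEIGHTS)  # 3 in dc1, 1 in dc2
print(two_dcs, lopsided)
```

In this layout four acks spread as 2+2 over two data centers are enough, whereas a plain majority of nine servers would need five. That is one way fewer cross-DC messages per transaction can come about, though whether it helps depends on where the leader sits and how the groups are drawn.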