- Downloaded tarball and verified signature and checksum.
- Deployed a 3-node cluster.
- Tested various zkCli commands.
- Killed the leader and verified a new leader was elected among the remaining 2 nodes.
- ZOOKEEPER-1660: Tested dynamic reconfiguration by following this new documentation.
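For anyone following along, the leader-kill check works because two of three voting servers still form a majority. Here is a minimal sketch of that arithmetic, assuming plain majority quorums; the class and method names are made up for illustration and are not ZooKeeper APIs.

```java
public class QuorumSketch {
    // Hypothetical helper: a majority quorum needs strictly more than
    // half of the voting members to be alive.
    static boolean hasQuorum(int votingMembers, int alive) {
        return alive > votingMembers / 2;
    }

    public static void main(String[] args) {
        // 3-node ensemble with the leader killed: 2 servers remain.
        System.out.println("3 voters, 2 alive: " + hasQuorum(3, 2));
        // Losing a second server would leave a single node, which is not a majority.
        System.out.println("3 voters, 1 alive: " + hasQuorum(3, 1));
    }
}
```

Observers do not vote, so they would not be counted in `votingMembers`.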
Thank you for putting together the release, Michi!
I think ZOOKEEPER-1506 could be problematic for some setups. After a couple of elections with a cluster of 5 participants and one observer, I end up with a participant that's unable to find the leader because it does a reverse lookup (IP -> hostname) and ends up with a bogus hostname that it can't resolve:
I don't think the reverse lookup from QuorumCnxManager was done before, nor that it should be done. So it could cause issues in places where reverse lookups aren't fully working. Surely, we could argue that it's a DNS setup issue but I think we should avoid the extra lookup if possible.
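To illustrate what I mean by avoiding the extra lookup, here is a minimal sketch using only the JDK. It is not the code in QuorumCnxManager; the class name, the address 10.0.0.5 and the port 3888 are placeholders I picked for the example. The point is that `InetSocketAddress.getHostString()` returns whatever host string the address already carries, while `getHostName()` may go to the resolver for an address built from a literal IP.

```java
import java.net.InetSocketAddress;

public class ReverseLookupSketch {
    // Hypothetical helper: returns the host part of a peer address without
    // asking the resolver for a reverse (IP -> hostname) mapping.
    static String hostWithoutReverseLookup(InetSocketAddress addr) {
        // getHostString() returns the hostname if one is already attached,
        // otherwise the literal IP string. It does not query DNS.
        return addr.getHostString();
    }

    public static void main(String[] args) {
        // Built from a literal IP, so no hostname is attached to the address.
        InetSocketAddress peer = new InetSocketAddress("10.0.0.5", 3888);
        System.out.println(hostWithoutReverseLookup(peer));
        // By contrast, peer.getHostName() may trigger a reverse lookup here,
        // and a misconfigured resolver can hand back a bogus name.
    }
}
```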
I'll dig in a bit deeper and try to come up with a deterministic repro.
-rgs
On 12 April 2015 at 14:58, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote:
On 20 April 2015 at 13:18, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
Done - I'll post my (hopefully reproducible) setup in a bit. I guess that patch might be triggering reverse lookups as an (undesired) side effect.
-rgs
Thanks Raul. I'd like to include the fix ( https://issues.apache.org/jira/browse/ZOOKEEPER-2171 ) in 3.5.1. I'll create another candidate once the issue is resolved. In the meantime, please let me know if you guys have any other feedback regarding this release candidate.
On Mon, Apr 20, 2015 at 1:44 PM, Raúl Gutiérrez Segalés <[EMAIL PROTECTED]> wrote:
On 20 April 2015 at 13:03, Raúl Gutiérrez Segalés <[EMAIL PROTECTED]> wrote:
Commented on ZOOKEEPER-1506: turns out that my issue was with reverse lookup calls that were not introduced by that patch. They seem to have been introduced by ZOOKEEPER-107, so they have been around for a while.
The tl;dr is that if your resolvers give you bad reverse names, you'll have issues. It would be nice to avoid these reverse lookups, so I created:
* many elections (which look quick)
* creating and deleting ephemerals in a loop (via zk-shell)
* phunt's smoke test scripts (comparable results to 3.5.0)
* partitioning and unpartitioning an attached observer
* using zktraffic's fle-dump & zab-dump to check for bogus FLE votes or ZAB messages
It looks like we have a couple of JIRA issues ready to check in to branch-3.5, such as ZOOKEEPER-2174 and ZOOKEEPER-2062, among others.
These are not blockers for the 3.5.1 release, though. Should we wait until 3.5.1 is released and then commit issues like these?
FYI: At present only two issues are marked for 3.5.1: ZOOKEEPER-2171 (required) and ZOOKEEPER-2124.
Thanks & Regards,
Rakesh
On Tue, Apr 28, 2015 at 5:00 AM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote:
On 2 May 2015 at 15:45, Patrick Hunt <[EMAIL PROTECTED]> wrote:
fwiw, ZOOKEEPER-2171 has a +1 from Rakesh and is ready to be merged (though an unrelated build/test failure happened after I updated it to address some last nits/details). I just gave ZOOKEEPER-2124 a +1, so it can probably be merged.
-rgs