|
Patrick Hunt
2011-12-28, 17:45
Camille Fournier
2011-12-28, 18:12
Camille Fournier
2011-12-28, 18:31
Camille Fournier
2011-12-28, 19:18
Patrick Hunt
2011-12-28, 19:24
Ted Yu
2011-12-28, 20:58
Patrick Hunt
2011-12-28, 21:04
Camille Fournier
2011-12-28, 21:10
Patrick Hunt
2011-12-28, 21:24
Camille Fournier
2011-12-28, 21:26
Patrick Hunt
2011-12-28, 21:32
Camille Fournier
2011-12-28, 21:53
Patrick Hunt
2011-12-28, 22:12
Camille Fournier
2011-12-28, 22:21
Patrick Hunt
2011-12-28, 22:45
|
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Patrick Hunt 2011-12-28, 17:45
I believe there is a bug in this commit. The "cnxns" size() call is
not being synchronized. This will lead to invalid results at best, at worst outright failure (hard to say w/o knowing the implementation of HashSet). Camille can you work with Neha to get this fixed? Perhaps in the meantime (if it's going to take a while) you can revert this change, re-open the jira, update the patch, and reapply at some later time? Patrick On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: > Author: camille > Date: Wed Dec 28 14:55:37 2011 > New Revision: 1225200 > > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev > Log: > ZOOKEEPER-1321: Add number of client connections metric in JMX and srvr (Neha Narkhede via camille) > > Modified: > zookeeper/trunk/CHANGES.txt > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxnFactory.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerStats.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerBean.java > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMXBean.java > zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java > zookeeper/trunk/src/java/test/org/apache/zookeeper/test/FourLetterWordsTest.java > > Modified: zookeeper/trunk/CHANGES.txt > URL: http://svn.apache.org/viewvc/zookeeper/trunk/CHANGES.txt?rev=1225200&r1=1225199&r2=1225200&view=diff > =============================================================================> --- zookeeper/trunk/CHANGES.txt (original) > +++ zookeeper/trunk/CHANGES.txt Wed Dec 28 14:55:37 2011 > @@ -162,6 +162,8 @@ IMPROVEMENTS: > > ZOOKEEPER-1342. quorum Listener & LearnerCnxAcceptor are missing > thread names (Rakesh R via phunt) > + > + ZOOKEEPER-1321. Add number of client connections metric in JMX and srvr (Neha Narkhede via camille) > > Release 3.4.0 - > > > Modified: zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > URL: http://svn.apache.org/viewvc/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java?rev=1225200&r1=1225199&r2=1225200&view=diff > =============================================================================> --- zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java (original) > +++ zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java Wed Dec 28 14:55:37 2011 > @@ -749,7 +749,8 @@ public class NIOServerCnxn extends Serve > > print("packets_received", stats.getPacketsReceived()); > print("packets_sent", stats.getPacketsSent()); > - > + print("num_alive_connections", stats.getNumAliveClientConnections()); > + > print("outstanding_requests", stats.getOutstandingRequests()); > > print("server_state", stats.getServerState()); > > Modified: zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java > URL: http://svn.apache.org/viewvc/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java?rev=1225200&r1=1225199&r2=1225200&view=diff > =============================================================================> --- zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java (original) > +++ zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java Wed Dec 28 14:55:37 2011 > @@ -32,14 +32,14 @@ import java.util.HashMap; > import java.util.HashSet; > import java.util.Set; > > +import javax.security.auth.login.Configuration; +
Patrick Hunt 2011-12-28, 17:45
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Camille Fournier 2011-12-28, 18:12
I'm not sure it's meaningful enough to be worth the sync overhead. We
should look into that. The alternative if we must is using a proper concurrent collection. Neha, any thoughts? C On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > I believe there is a bug in this commit. The "cnxns" size() call is > not being synchronized. This will lead to invalid results at best, at > worst outright failure (hard to say w/o knowing the implementation of > HashSet). > > Camille can you work with Neha to get this fixed? Perhaps in the > meantime (if it's going to take a while) you can revert this change, > re-open the jira, update the patch, and reapply at some later time? > > Patrick > > On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: > > Author: camille > > Date: Wed Dec 28 14:55:37 2011 > > New Revision: 1225200 > > > > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev > > Log: > > ZOOKEEPER-1321: Add number of client connections metric in JMX and srvr > (Neha Narkhede via camille) > > > > Modified: > > zookeeper/trunk/CHANGES.txt > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxnFactory.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerStats.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerBean.java > > > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMXBean.java > > > zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java > > > zookeeper/trunk/src/java/test/org/apache/zookeeper/test/FourLetterWordsTest.java > > > > Modified: zookeeper/trunk/CHANGES.txt > > URL: > http://svn.apache.org/viewvc/zookeeper/trunk/CHANGES.txt?rev=1225200&r1=1225199&r2=1225200&view=diff > > > =============================================================================> > --- zookeeper/trunk/CHANGES.txt (original) > > +++ zookeeper/trunk/CHANGES.txt Wed Dec 28 14:55:37 2011 > > @@ -162,6 +162,8 @@ IMPROVEMENTS: > > > > ZOOKEEPER-1342. quorum Listener & LearnerCnxAcceptor are missing > > thread names (Rakesh R via phunt) > > + > > + ZOOKEEPER-1321. Add number of client connections metric in JMX and > srvr (Neha Narkhede via camille) > > > > Release 3.4.0 - > > > > > > Modified: > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > > URL: > http://svn.apache.org/viewvc/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java?rev=1225200&r1=1225199&r2=1225200&view=diff > > > =============================================================================> > --- > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > (original) > > +++ > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > Wed Dec 28 14:55:37 2011 > > @@ -749,7 +749,8 @@ public class NIOServerCnxn extends Serve > > > > print("packets_received", stats.getPacketsReceived()); > > print("packets_sent", stats.getPacketsSent()); > > - > > + print("num_alive_connections", > stats.getNumAliveClientConnections()); > > + > > print("outstanding_requests", > stats.getOutstandingRequests()); > > > > print("server_state", stats.getServerState()); > > > > Modified: > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java > > URL: > http://svn.apache.org/viewvc/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java?rev=1225200&r1=1225199&r2=1225200&view=diff > > > =============================================================================> > --- +
Camille Fournier 2011-12-28, 18:12
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Camille Fournier 2011-12-28, 18:31
After looking for a few mins, here are my observations:
The implementation of HashSet in jdk 1.6 uses an underlying HashMap, which uses a bog-standard int for the size. So, I completely agree that we can get invalid results for a point in time. But does anyone really care about the *exact* moment-in-time number of cnxns in the system, in a system where cnxns are coming and going? At best you'll see a value that will likely be out of date the moment you read it. But, if we're going to do this, I do think there may be a case for refactoring to a concurrent data structure given that we are randomly syncing on the cnxns set in a way that makes adding additional interactions with it error prone in this way. And we should definitely remove that getConnections method, if anyone ever iterated over that set they would be in a sad concurrency situation. Seems possibly worth a whole new ticket to do this. What do you think? Thanks, C On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier <[EMAIL PROTECTED]>wrote: > I'm not sure it's meaningful enough to be worth the sync overhead. We > should look into that. The alternative if we must is using a proper > concurrent collection. Neha, any thoughts? > > C > On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >> I believe there is a bug in this commit. The "cnxns" size() call is >> not being synchronized. This will lead to invalid results at best, at >> worst outright failure (hard to say w/o knowing the implementation of >> HashSet). >> >> Camille can you work with Neha to get this fixed? Perhaps in the >> meantime (if it's going to take a while) you can revert this change, >> re-open the jira, update the patch, and reapply at some later time? >> >> Patrick >> >> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: >> > Author: camille >> > Date: Wed Dec 28 14:55:37 2011 >> > New Revision: 1225200 >> > >> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev >> > Log: >> > ZOOKEEPER-1321: Add number of client connections metric in JMX and srvr >> (Neha Narkhede via camille) >> > >> > Modified: >> > zookeeper/trunk/CHANGES.txt >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxnFactory.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerStats.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerBean.java >> > >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMXBean.java >> > >> zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java >> > >> zookeeper/trunk/src/java/test/org/apache/zookeeper/test/FourLetterWordsTest.java >> > >> > Modified: zookeeper/trunk/CHANGES.txt >> > URL: >> http://svn.apache.org/viewvc/zookeeper/trunk/CHANGES.txt?rev=1225200&r1=1225199&r2=1225200&view=diff >> > >> =============================================================================>> > --- zookeeper/trunk/CHANGES.txt (original) >> > +++ zookeeper/trunk/CHANGES.txt Wed Dec 28 14:55:37 2011 >> > @@ -162,6 +162,8 @@ IMPROVEMENTS: >> > >> > ZOOKEEPER-1342. quorum Listener & LearnerCnxAcceptor are missing >> > thread names (Rakesh R via phunt) >> > + >> > + ZOOKEEPER-1321. Add number of client connections metric in JMX and >> srvr (Neha Narkhede via camille) >> > >> > Release 3.4.0 - >> > >> > >> > Modified: >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java >> > URL: >> http://svn.apache.org/viewvc/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java?rev=1225200&r1=1225199&r2=1225200&view=diff +
Camille Fournier 2011-12-28, 18:31
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Camille Fournier 2011-12-28, 19:18
Also, Pat, the JIRA is still open. I will look to revert in a minute
although I think the sync changes may necessitate a new JIRA entirely. On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED]> wrote: > After looking for a few mins, here are my observations: > > The implementation of HashSet in jdk 1.6 uses an underlying HashMap, which > uses a bog-standard int for the size. So, I completely agree that we can > get invalid results for a point in time. But does anyone really care about > the *exact* moment-in-time number of cnxns in the system, in a system where > cnxns are coming and going? At best you'll see a value that will likely be > out of date the moment you read it. > > But, if we're going to do this, I do think there may be a case for > refactoring to a concurrent data structure given that we are randomly > syncing on the cnxns set in a way that makes adding additional interactions > with it error prone in this way. And we should definitely remove that > getConnections method, if anyone ever iterated over that set they would be > in a sad concurrency situation. > Seems possibly worth a whole new ticket to do this. What do you think? > > Thanks, > C > > On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier <[EMAIL PROTECTED]>wrote: > >> I'm not sure it's meaningful enough to be worth the sync overhead. We >> should look into that. The alternative if we must is using a proper >> concurrent collection. Neha, any thoughts? >> >> C >> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> >>> I believe there is a bug in this commit. The "cnxns" size() call is >>> not being synchronized. This will lead to invalid results at best, at >>> worst outright failure (hard to say w/o knowing the implementation of >>> HashSet). >>> >>> Camille can you work with Neha to get this fixed? Perhaps in the >>> meantime (if it's going to take a while) you can revert this change, >>> re-open the jira, update the patch, and reapply at some later time? >>> >>> Patrick >>> >>> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: >>> > Author: camille >>> > Date: Wed Dec 28 14:55:37 2011 >>> > New Revision: 1225200 >>> > >>> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev >>> > Log: >>> > ZOOKEEPER-1321: Add number of client connections metric in JMX and srvr >>> (Neha Narkhede via camille) >>> > >>> > Modified: >>> > zookeeper/trunk/CHANGES.txt >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxnFactory.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerStats.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerBean.java >>> > >>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMXBean.java >>> > >>> zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java >>> > >>> zookeeper/trunk/src/java/test/org/apache/zookeeper/test/FourLetterWordsTest.java >>> > >>> > Modified: zookeeper/trunk/CHANGES.txt >>> > URL: >>> http://svn.apache.org/viewvc/zookeeper/trunk/CHANGES.txt?rev=1225200&r1=1225199&r2=1225200&view=diff >>> > >>> =============================================================================>>> > --- zookeeper/trunk/CHANGES.txt (original) >>> > +++ zookeeper/trunk/CHANGES.txt Wed Dec 28 14:55:37 2011 >>> > @@ -162,6 +162,8 @@ IMPROVEMENTS: >>> > >>> > ZOOKEEPER-1342. quorum Listener & LearnerCnxAcceptor are missing >>> > thread names (Rakesh R via phunt) >>> > + >>> > + ZOOKEEPER-1321. Add number of client connections metric in JMX and +
Camille Fournier 2011-12-28, 19:18
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Patrick Hunt 2011-12-28, 19:24
I think it needs to be fixed. It's obviously incorrect. Also if
someone changes the underlying implementation at some point it might bite them. Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the patch, then reapply to both branches. (that's easiest/bulletproof imo) I'd recommend a new ticket to fix the issues you identified (refactor, etc...) Neha was looking for further items to work on, this would be a good one. Patrick On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier <[EMAIL PROTECTED]> wrote: > Also, Pat, the JIRA is still open. I will look to revert in a minute > although I think the sync changes may necessitate a new JIRA entirely. > > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED]> wrote: >> After looking for a few mins, here are my observations: >> >> The implementation of HashSet in jdk 1.6 uses an underlying HashMap, which >> uses a bog-standard int for the size. So, I completely agree that we can >> get invalid results for a point in time. But does anyone really care about >> the *exact* moment-in-time number of cnxns in the system, in a system where >> cnxns are coming and going? At best you'll see a value that will likely be >> out of date the moment you read it. >> >> But, if we're going to do this, I do think there may be a case for >> refactoring to a concurrent data structure given that we are randomly >> syncing on the cnxns set in a way that makes adding additional interactions >> with it error prone in this way. And we should definitely remove that >> getConnections method, if anyone ever iterated over that set they would be >> in a sad concurrency situation. >> Seems possibly worth a whole new ticket to do this. What do you think? >> >> Thanks, >> C >> >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier <[EMAIL PROTECTED]>wrote: >> >>> I'm not sure it's meaningful enough to be worth the sync overhead. We >>> should look into that. The alternative if we must is using a proper >>> concurrent collection. Neha, any thoughts? >>> >>> C >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >>> >>>> I believe there is a bug in this commit. The "cnxns" size() call is >>>> not being synchronized. This will lead to invalid results at best, at >>>> worst outright failure (hard to say w/o knowing the implementation of >>>> HashSet). >>>> >>>> Camille can you work with Neha to get this fixed? Perhaps in the >>>> meantime (if it's going to take a while) you can revert this change, >>>> re-open the jira, update the patch, and reapply at some later time? >>>> >>>> Patrick >>>> >>>> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: >>>> > Author: camille >>>> > Date: Wed Dec 28 14:55:37 2011 >>>> > New Revision: 1225200 >>>> > >>>> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev >>>> > Log: >>>> > ZOOKEEPER-1321: Add number of client connections metric in JMX and srvr >>>> (Neha Narkhede via camille) >>>> > >>>> > Modified: >>>> > zookeeper/trunk/CHANGES.txt >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxnFactory.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerStats.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerBean.java >>>> > >>>> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMXBean.java >>>> > >>>> zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java >>>> > >>>> zookeeper/trunk/src/java/test/org/apache/zookeeper/test/FourLetterWordsTest.java +
Patrick Hunt 2011-12-28, 19:24
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Ted Yu 2011-12-28, 20:58
w.r.t. getConnections() method, I think we can keep it - just clone a new
Set and return the clone. In javadoc, we document that getConnections() incurs additional cost in cloning. Cheers On Wed, Dec 28, 2011 at 11:24 AM, Patrick Hunt <[EMAIL PROTECTED]> wrote: > I think it needs to be fixed. It's obviously incorrect. Also if > someone changes the underlying implementation at some point it might > bite them. > > Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the > patch, then reapply to both branches. (that's easiest/bulletproof imo) > > I'd recommend a new ticket to fix the issues you identified (refactor, > etc...) Neha was looking for further items to work on, this would be a > good one. > > Patrick > > On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier <[EMAIL PROTECTED]> > wrote: > > Also, Pat, the JIRA is still open. I will look to revert in a minute > > although I think the sync changes may necessitate a new JIRA entirely. > > > > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED]> > wrote: > >> After looking for a few mins, here are my observations: > >> > >> The implementation of HashSet in jdk 1.6 uses an underlying HashMap, > which > >> uses a bog-standard int for the size. So, I completely agree that we can > >> get invalid results for a point in time. But does anyone really care > about > >> the *exact* moment-in-time number of cnxns in the system, in a system > where > >> cnxns are coming and going? At best you'll see a value that will likely > be > >> out of date the moment you read it. > >> > >> But, if we're going to do this, I do think there may be a case for > >> refactoring to a concurrent data structure given that we are randomly > >> syncing on the cnxns set in a way that makes adding additional > interactions > >> with it error prone in this way. And we should definitely remove that > >> getConnections method, if anyone ever iterated over that set they would > be > >> in a sad concurrency situation. > >> Seems possibly worth a whole new ticket to do this. What do you think? > >> > >> Thanks, > >> C > >> > >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier <[EMAIL PROTECTED] > >wrote: > >> > >>> I'm not sure it's meaningful enough to be worth the sync overhead. We > >>> should look into that. The alternative if we must is using a proper > >>> concurrent collection. Neha, any thoughts? > >>> > >>> C > >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >>> > >>>> I believe there is a bug in this commit. The "cnxns" size() call is > >>>> not being synchronized. This will lead to invalid results at best, at > >>>> worst outright failure (hard to say w/o knowing the implementation of > >>>> HashSet). > >>>> > >>>> Camille can you work with Neha to get this fixed? Perhaps in the > >>>> meantime (if it's going to take a while) you can revert this change, > >>>> re-open the jira, update the patch, and reapply at some later time? > >>>> > >>>> Patrick > >>>> > >>>> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: > >>>> > Author: camille > >>>> > Date: Wed Dec 28 14:55:37 2011 > >>>> > New Revision: 1225200 > >>>> > > >>>> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev > >>>> > Log: > >>>> > ZOOKEEPER-1321: Add number of client connections metric in JMX and > srvr > >>>> (Neha Narkhede via camille) > >>>> > > >>>> > Modified: > >>>> > zookeeper/trunk/CHANGES.txt > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxnFactory.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerStats.java +
Ted Yu 2011-12-28, 20:58
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Patrick Hunt 2011-12-28, 21:04
On Wed, Dec 28, 2011 at 12:58 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> w.r.t. getConnections() method, I think we can keep it - just clone a new > Set and return the clone. > In javadoc, we document that getConnections() incurs additional cost in > cloning. Sounds reasonable but given there are no callers of this function why keep it? I'd still rather we remove it given it's dead code. Perhaps there are locations where we should be using and aren't? > > On Wed, Dec 28, 2011 at 11:24 AM, Patrick Hunt <[EMAIL PROTECTED]> wrote: > >> I think it needs to be fixed. It's obviously incorrect. Also if >> someone changes the underlying implementation at some point it might >> bite them. >> >> Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the >> patch, then reapply to both branches. (that's easiest/bulletproof imo) >> >> I'd recommend a new ticket to fix the issues you identified (refactor, >> etc...) Neha was looking for further items to work on, this would be a >> good one. >> >> Patrick >> >> On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier <[EMAIL PROTECTED]> >> wrote: >> > Also, Pat, the JIRA is still open. I will look to revert in a minute >> > although I think the sync changes may necessitate a new JIRA entirely. >> > >> > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED]> >> wrote: >> >> After looking for a few mins, here are my observations: >> >> >> >> The implementation of HashSet in jdk 1.6 uses an underlying HashMap, >> which >> >> uses a bog-standard int for the size. So, I completely agree that we can >> >> get invalid results for a point in time. But does anyone really care >> about >> >> the *exact* moment-in-time number of cnxns in the system, in a system >> where >> >> cnxns are coming and going? At best you'll see a value that will likely >> be >> >> out of date the moment you read it. >> >> >> >> But, if we're going to do this, I do think there may be a case for >> >> refactoring to a concurrent data structure given that we are randomly >> >> syncing on the cnxns set in a way that makes adding additional >> interactions >> >> with it error prone in this way. And we should definitely remove that >> >> getConnections method, if anyone ever iterated over that set they would >> be >> >> in a sad concurrency situation. >> >> Seems possibly worth a whole new ticket to do this. What do you think? >> >> >> >> Thanks, >> >> C >> >> >> >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier <[EMAIL PROTECTED] >> >wrote: >> >> >> >>> I'm not sure it's meaningful enough to be worth the sync overhead. We >> >>> should look into that. The alternative if we must is using a proper >> >>> concurrent collection. Neha, any thoughts? >> >>> >> >>> C >> >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> >>> >> >>>> I believe there is a bug in this commit. The "cnxns" size() call is >> >>>> not being synchronized. This will lead to invalid results at best, at >> >>>> worst outright failure (hard to say w/o knowing the implementation of >> >>>> HashSet). >> >>>> >> >>>> Camille can you work with Neha to get this fixed? Perhaps in the >> >>>> meantime (if it's going to take a while) you can revert this change, >> >>>> re-open the jira, update the patch, and reapply at some later time? >> >>>> >> >>>> Patrick >> >>>> >> >>>> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: >> >>>> > Author: camille >> >>>> > Date: Wed Dec 28 14:55:37 2011 >> >>>> > New Revision: 1225200 >> >>>> > >> >>>> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev >> >>>> > Log: >> >>>> > ZOOKEEPER-1321: Add number of client connections metric in JMX and >> srvr >> >>>> (Neha Narkhede via camille) >> >>>> > >> >>>> > Modified: >> >>>> > zookeeper/trunk/CHANGES.txt >> >>>> > >> >>>> >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java >> >>>> > >> >>>> >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java >> >>>> > >> >> +
Patrick Hunt 2011-12-28, 21:04
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Camille Fournier 2011-12-28, 21:10
I don't think creating it as a flat sync is a good idea, so if we want this
I think we need to refactor that structure to be concurrent. C On Dec 28, 2011 2:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > I think it needs to be fixed. It's obviously incorrect. Also if > someone changes the underlying implementation at some point it might > bite them. > > Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the > patch, then reapply to both branches. (that's easiest/bulletproof imo) > > I'd recommend a new ticket to fix the issues you identified (refactor, > etc...) Neha was looking for further items to work on, this would be a > good one. > > Patrick > > On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier <[EMAIL PROTECTED]> > wrote: > > Also, Pat, the JIRA is still open. I will look to revert in a minute > > although I think the sync changes may necessitate a new JIRA entirely. > > > > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED]> > wrote: > >> After looking for a few mins, here are my observations: > >> > >> The implementation of HashSet in jdk 1.6 uses an underlying HashMap, > which > >> uses a bog-standard int for the size. So, I completely agree that we can > >> get invalid results for a point in time. But does anyone really care > about > >> the *exact* moment-in-time number of cnxns in the system, in a system > where > >> cnxns are coming and going? At best you'll see a value that will likely > be > >> out of date the moment you read it. > >> > >> But, if we're going to do this, I do think there may be a case for > >> refactoring to a concurrent data structure given that we are randomly > >> syncing on the cnxns set in a way that makes adding additional > interactions > >> with it error prone in this way. And we should definitely remove that > >> getConnections method, if anyone ever iterated over that set they would > be > >> in a sad concurrency situation. > >> Seems possibly worth a whole new ticket to do this. What do you think? > >> > >> Thanks, > >> C > >> > >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier <[EMAIL PROTECTED] > >wrote: > >> > >>> I'm not sure it's meaningful enough to be worth the sync overhead. We > >>> should look into that. The alternative if we must is using a proper > >>> concurrent collection. Neha, any thoughts? > >>> > >>> C > >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >>> > >>>> I believe there is a bug in this commit. The "cnxns" size() call is > >>>> not being synchronized. This will lead to invalid results at best, at > >>>> worst outright failure (hard to say w/o knowing the implementation of > >>>> HashSet). > >>>> > >>>> Camille can you work with Neha to get this fixed? Perhaps in the > >>>> meantime (if it's going to take a while) you can revert this change, > >>>> re-open the jira, update the patch, and reapply at some later time? > >>>> > >>>> Patrick > >>>> > >>>> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: > >>>> > Author: camille > >>>> > Date: Wed Dec 28 14:55:37 2011 > >>>> > New Revision: 1225200 > >>>> > > >>>> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev > >>>> > Log: > >>>> > ZOOKEEPER-1321: Add number of client connections metric in JMX and > srvr > >>>> (Neha Narkhede via camille) > >>>> > > >>>> > Modified: > >>>> > zookeeper/trunk/CHANGES.txt > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NettyServerCnxnFactory.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ServerStats.java > >>>> > > >>>> > zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java +
Camille Fournier 2011-12-28, 21:10
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Patrick Hunt 2011-12-28, 21:24
They seem like distinct changes to me. In particular getting the size
is going to happen infrequently (monitoring pull) so I don't see a problem fixing the existing patch in the same way the code currently handles cnxns access, with a separate jira to do the refactoring. Am I missing something? Patrick On Wed, Dec 28, 2011 at 1:10 PM, Camille Fournier <[EMAIL PROTECTED]> wrote: > I don't think creating it as a flat sync is a good idea, so if we want this > I think we need to refactor that structure to be concurrent. > > C > On Dec 28, 2011 2:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >> I think it needs to be fixed. It's obviously incorrect. Also if >> someone changes the underlying implementation at some point it might >> bite them. >> >> Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the >> patch, then reapply to both branches. (that's easiest/bulletproof imo) >> >> I'd recommend a new ticket to fix the issues you identified (refactor, >> etc...) Neha was looking for further items to work on, this would be a >> good one. >> >> Patrick >> >> On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier <[EMAIL PROTECTED]> >> wrote: >> > Also, Pat, the JIRA is still open. I will look to revert in a minute >> > although I think the sync changes may necessitate a new JIRA entirely. >> > >> > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED]> >> wrote: >> >> After looking for a few mins, here are my observations: >> >> >> >> The implementation of HashSet in jdk 1.6 uses an underlying HashMap, >> which >> >> uses a bog-standard int for the size. So, I completely agree that we can >> >> get invalid results for a point in time. But does anyone really care >> about >> >> the *exact* moment-in-time number of cnxns in the system, in a system >> where >> >> cnxns are coming and going? At best you'll see a value that will likely >> be >> >> out of date the moment you read it. >> >> >> >> But, if we're going to do this, I do think there may be a case for >> >> refactoring to a concurrent data structure given that we are randomly >> >> syncing on the cnxns set in a way that makes adding additional >> interactions >> >> with it error prone in this way. And we should definitely remove that >> >> getConnections method, if anyone ever iterated over that set they would >> be >> >> in a sad concurrency situation. >> >> Seems possibly worth a whole new ticket to do this. What do you think? >> >> >> >> Thanks, >> >> C >> >> >> >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier <[EMAIL PROTECTED] >> >wrote: >> >> >> >>> I'm not sure it's meaningful enough to be worth the sync overhead. We >> >>> should look into that. The alternative if we must is using a proper >> >>> concurrent collection. Neha, any thoughts? >> >>> >> >>> C >> >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> >>> >> >>>> I believe there is a bug in this commit. The "cnxns" size() call is >> >>>> not being synchronized. This will lead to invalid results at best, at >> >>>> worst outright failure (hard to say w/o knowing the implementation of >> >>>> HashSet). >> >>>> >> >>>> Camille can you work with Neha to get this fixed? Perhaps in the >> >>>> meantime (if it's going to take a while) you can revert this change, >> >>>> re-open the jira, update the patch, and reapply at some later time? >> >>>> >> >>>> Patrick >> >>>> >> >>>> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: >> >>>> > Author: camille >> >>>> > Date: Wed Dec 28 14:55:37 2011 >> >>>> > New Revision: 1225200 >> >>>> > >> >>>> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev >> >>>> > Log: >> >>>> > ZOOKEEPER-1321: Add number of client connections metric in JMX and >> srvr >> >>>> (Neha Narkhede via camille) >> >>>> > >> >>>> > Modified: >> >>>> > zookeeper/trunk/CHANGES.txt >> >>>> > >> >>>> >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java >> >>>> > >> >>>> >> zookeeper/trunk/src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java +
Patrick Hunt 2011-12-28, 21:24
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Camille Fournier 2011-12-28, 21:26
I'm not sure I agree in the assumption that monitoring pulls happen
infrequently... On Dec 28, 2011 4:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > They seem like distinct changes to me. In particular getting the size > is going to happen infrequently (monitoring pull) so I don't see a > problem fixing the existing patch in the same way the code currently > handles cnxns access, with a separate jira to do the refactoring. Am I > missing something? > > Patrick > > On Wed, Dec 28, 2011 at 1:10 PM, Camille Fournier <[EMAIL PROTECTED]> > wrote: > > I don't think creating it as a flat sync is a good idea, so if we want > this > > I think we need to refactor that structure to be concurrent. > > > > C > > On Dec 28, 2011 2:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > > > >> I think it needs to be fixed. It's obviously incorrect. Also if > >> someone changes the underlying implementation at some point it might > >> bite them. > >> > >> Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the > >> patch, then reapply to both branches. (that's easiest/bulletproof imo) > >> > >> I'd recommend a new ticket to fix the issues you identified (refactor, > >> etc...) Neha was looking for further items to work on, this would be a > >> good one. > >> > >> Patrick > >> > >> On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier <[EMAIL PROTECTED]> > >> wrote: > >> > Also, Pat, the JIRA is still open. I will look to revert in a minute > >> > although I think the sync changes may necessitate a new JIRA entirely. > >> > > >> > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED] > > > >> wrote: > >> >> After looking for a few mins, here are my observations: > >> >> > >> >> The implementation of HashSet in jdk 1.6 uses an underlying HashMap, > >> which > >> >> uses a bog-standard int for the size. So, I completely agree that we > can > >> >> get invalid results for a point in time. But does anyone really care > >> about > >> >> the *exact* moment-in-time number of cnxns in the system, in a system > >> where > >> >> cnxns are coming and going? At best you'll see a value that will > likely > >> be > >> >> out of date the moment you read it. > >> >> > >> >> But, if we're going to do this, I do think there may be a case for > >> >> refactoring to a concurrent data structure given that we are randomly > >> >> syncing on the cnxns set in a way that makes adding additional > >> interactions > >> >> with it error prone in this way. And we should definitely remove that > >> >> getConnections method, if anyone ever iterated over that set they > would > >> be > >> >> in a sad concurrency situation. > >> >> Seems possibly worth a whole new ticket to do this. What do you > think? > >> >> > >> >> Thanks, > >> >> C > >> >> > >> >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier < > [EMAIL PROTECTED] > >> >wrote: > >> >> > >> >>> I'm not sure it's meaningful enough to be worth the sync overhead. > We > >> >>> should look into that. The alternative if we must is using a proper > >> >>> concurrent collection. Neha, any thoughts? > >> >>> > >> >>> C > >> >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >> >>> > >> >>>> I believe there is a bug in this commit. The "cnxns" size() call is > >> >>>> not being synchronized. This will lead to invalid results at best, > at > >> >>>> worst outright failure (hard to say w/o knowing the implementation > of > >> >>>> HashSet). > >> >>>> > >> >>>> Camille can you work with Neha to get this fixed? Perhaps in the > >> >>>> meantime (if it's going to take a while) you can revert this > change, > >> >>>> re-open the jira, update the patch, and reapply at some later time? > >> >>>> > >> >>>> Patrick > >> >>>> > >> >>>> On Wed, Dec 28, 2011 at 6:55 AM, <[EMAIL PROTECTED]> wrote: > >> >>>> > Author: camille > >> >>>> > Date: Wed Dec 28 14:55:37 2011 > >> >>>> > New Revision: 1225200 > >> >>>> > > >> >>>> > URL: http://svn.apache.org/viewvc?rev=1225200&view=rev > >> >>>> > Log: +
Camille Fournier 2011-12-28, 21:26
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Patrick Hunt 2011-12-28, 21:32
On Wed, Dec 28, 2011 at 1:26 PM, Camille Fournier <[EMAIL PROTECTED]> wrote:
> I'm not sure I agree in the assumption that monitoring pulls happen > infrequently... Well that would be bad news then - see "stat" command processing and similar. All the more reason to look at this in more depth. Patrick > On Dec 28, 2011 4:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >> They seem like distinct changes to me. In particular getting the size >> is going to happen infrequently (monitoring pull) so I don't see a >> problem fixing the existing patch in the same way the code currently >> handles cnxns access, with a separate jira to do the refactoring. Am I >> missing something? >> >> Patrick >> >> On Wed, Dec 28, 2011 at 1:10 PM, Camille Fournier <[EMAIL PROTECTED]> >> wrote: >> > I don't think creating it as a flat sync is a good idea, so if we want >> this >> > I think we need to refactor that structure to be concurrent. >> > >> > C >> > On Dec 28, 2011 2:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> > >> >> I think it needs to be fixed. It's obviously incorrect. Also if >> >> someone changes the underlying implementation at some point it might >> >> bite them. >> >> >> >> Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the >> >> patch, then reapply to both branches. (that's easiest/bulletproof imo) >> >> >> >> I'd recommend a new ticket to fix the issues you identified (refactor, >> >> etc...) Neha was looking for further items to work on, this would be a >> >> good one. >> >> >> >> Patrick >> >> >> >> On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier <[EMAIL PROTECTED]> >> >> wrote: >> >> > Also, Pat, the JIRA is still open. I will look to revert in a minute >> >> > although I think the sync changes may necessitate a new JIRA entirely. >> >> > >> >> > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier <[EMAIL PROTECTED] >> > >> >> wrote: >> >> >> After looking for a few mins, here are my observations: >> >> >> >> >> >> The implementation of HashSet in jdk 1.6 uses an underlying HashMap, >> >> which >> >> >> uses a bog-standard int for the size. So, I completely agree that we >> can >> >> >> get invalid results for a point in time. But does anyone really care >> >> about >> >> >> the *exact* moment-in-time number of cnxns in the system, in a system >> >> where >> >> >> cnxns are coming and going? At best you'll see a value that will >> likely >> >> be >> >> >> out of date the moment you read it. >> >> >> >> >> >> But, if we're going to do this, I do think there may be a case for >> >> >> refactoring to a concurrent data structure given that we are randomly >> >> >> syncing on the cnxns set in a way that makes adding additional >> >> interactions >> >> >> with it error prone in this way. And we should definitely remove that >> >> >> getConnections method, if anyone ever iterated over that set they >> would >> >> be >> >> >> in a sad concurrency situation. >> >> >> Seems possibly worth a whole new ticket to do this. What do you >> think? >> >> >> >> >> >> Thanks, >> >> >> C >> >> >> >> >> >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier < >> [EMAIL PROTECTED] >> >> >wrote: >> >> >> >> >> >>> I'm not sure it's meaningful enough to be worth the sync overhead. >> We >> >> >>> should look into that. The alternative if we must is using a proper >> >> >>> concurrent collection. Neha, any thoughts? >> >> >>> >> >> >>> C >> >> >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> >> >>> >> >> >>>> I believe there is a bug in this commit. The "cnxns" size() call is >> >> >>>> not being synchronized. This will lead to invalid results at best, >> at >> >> >>>> worst outright failure (hard to say w/o knowing the implementation >> of >> >> >>>> HashSet). >> >> >>>> >> >> >>>> Camille can you work with Neha to get this fixed? Perhaps in the >> >> >>>> meantime (if it's going to take a while) you can revert this >> change, >> >> >>>> re-open the jira, update the patch, and reapply at some later time? +
Patrick Hunt 2011-12-28, 21:32
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Camille Fournier 2011-12-28, 21:53
Fair enough. Syncs for all! It's a drop in the bucket compared to the fact
that we aren't handling the fundamental socket connections for 4lws correctly anyway. On Wed, Dec 28, 2011 at 4:32 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote: > On Wed, Dec 28, 2011 at 1:26 PM, Camille Fournier <[EMAIL PROTECTED]> > wrote: > > I'm not sure I agree in the assumption that monitoring pulls happen > > infrequently... > > Well that would be bad news then - see "stat" command processing and > similar. All the more reason to look at this in more depth. > > Patrick > > > > On Dec 28, 2011 4:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > > > >> They seem like distinct changes to me. In particular getting the size > >> is going to happen infrequently (monitoring pull) so I don't see a > >> problem fixing the existing patch in the same way the code currently > >> handles cnxns access, with a separate jira to do the refactoring. Am I > >> missing something? > >> > >> Patrick > >> > >> On Wed, Dec 28, 2011 at 1:10 PM, Camille Fournier <[EMAIL PROTECTED]> > >> wrote: > >> > I don't think creating it as a flat sync is a good idea, so if we want > >> this > >> > I think we need to refactor that structure to be concurrent. > >> > > >> > C > >> > On Dec 28, 2011 2:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >> > > >> >> I think it needs to be fixed. It's obviously incorrect. Also if > >> >> someone changes the underlying implementation at some point it might > >> >> bite them. > >> >> > >> >> Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the > >> >> patch, then reapply to both branches. (that's easiest/bulletproof > imo) > >> >> > >> >> I'd recommend a new ticket to fix the issues you identified > (refactor, > >> >> etc...) Neha was looking for further items to work on, this would be > a > >> >> good one. > >> >> > >> >> Patrick > >> >> > >> >> On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier < > [EMAIL PROTECTED]> > >> >> wrote: > >> >> > Also, Pat, the JIRA is still open. I will look to revert in a > minute > >> >> > although I think the sync changes may necessitate a new JIRA > entirely. > >> >> > > >> >> > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier < > [EMAIL PROTECTED] > >> > > >> >> wrote: > >> >> >> After looking for a few mins, here are my observations: > >> >> >> > >> >> >> The implementation of HashSet in jdk 1.6 uses an underlying > HashMap, > >> >> which > >> >> >> uses a bog-standard int for the size. So, I completely agree that > we > >> can > >> >> >> get invalid results for a point in time. But does anyone really > care > >> >> about > >> >> >> the *exact* moment-in-time number of cnxns in the system, in a > system > >> >> where > >> >> >> cnxns are coming and going? At best you'll see a value that will > >> likely > >> >> be > >> >> >> out of date the moment you read it. > >> >> >> > >> >> >> But, if we're going to do this, I do think there may be a case for > >> >> >> refactoring to a concurrent data structure given that we are > randomly > >> >> >> syncing on the cnxns set in a way that makes adding additional > >> >> interactions > >> >> >> with it error prone in this way. And we should definitely remove > that > >> >> >> getConnections method, if anyone ever iterated over that set they > >> would > >> >> be > >> >> >> in a sad concurrency situation. > >> >> >> Seems possibly worth a whole new ticket to do this. What do you > >> think? > >> >> >> > >> >> >> Thanks, > >> >> >> C > >> >> >> > >> >> >> On Wed, Dec 28, 2011 at 1:12 PM, Camille Fournier < > >> [EMAIL PROTECTED] > >> >> >wrote: > >> >> >> > >> >> >>> I'm not sure it's meaningful enough to be worth the sync > overhead. > >> We > >> >> >>> should look into that. The alternative if we must is using a > proper > >> >> >>> concurrent collection. Neha, any thoughts? > >> >> >>> > >> >> >>> C > >> >> >>> On Dec 28, 2011 12:45 PM, "Patrick Hunt" <[EMAIL PROTECTED]> > wrote: > >> >> >>> > >> >> >>>> I believe there is a bug in this commit. The "cnxns" size() +
Camille Fournier 2011-12-28, 21:53
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Patrick Hunt 2011-12-28, 22:12
On Wed, Dec 28, 2011 at 1:53 PM, Camille Fournier <[EMAIL PROTECTED]> wrote:
> Fair enough. Syncs for all! It's a drop in the bucket compared to the fact > that we aren't handling the fundamental socket connections for 4lws > correctly anyway. sorry. I see a blog post coming on. ;-) You're referring to the following (any more?) https://issues.apache.org/jira/browse/ZOOKEEPER-1197 https://issues.apache.org/jira/browse/ZOOKEEPER-1237 is this comment still a correct interpretation of the issue? https://issues.apache.org/jira/browse/ZOOKEEPER-737?focusedCommentId=13109067&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13109067 What would you like to do about these? I had proposed having a separate 4lw port some time back in order to simplify/isolate. Should we do this for 3.5.0? (deprecate the old port usage but keep it around for a while) Something else? Patrick > On Wed, Dec 28, 2011 at 4:32 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote: > >> On Wed, Dec 28, 2011 at 1:26 PM, Camille Fournier <[EMAIL PROTECTED]> >> wrote: >> > I'm not sure I agree in the assumption that monitoring pulls happen >> > infrequently... >> >> Well that would be bad news then - see "stat" command processing and >> similar. All the more reason to look at this in more depth. >> >> Patrick >> >> >> > On Dec 28, 2011 4:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> > >> >> They seem like distinct changes to me. In particular getting the size >> >> is going to happen infrequently (monitoring pull) so I don't see a >> >> problem fixing the existing patch in the same way the code currently >> >> handles cnxns access, with a separate jira to do the refactoring. Am I >> >> missing something? >> >> >> >> Patrick >> >> >> >> On Wed, Dec 28, 2011 at 1:10 PM, Camille Fournier <[EMAIL PROTECTED]> >> >> wrote: >> >> > I don't think creating it as a flat sync is a good idea, so if we want >> >> this >> >> > I think we need to refactor that structure to be concurrent. >> >> > >> >> > C >> >> > On Dec 28, 2011 2:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> >> > >> >> >> I think it needs to be fixed. It's obviously incorrect. Also if >> >> >> someone changes the underlying implementation at some point it might >> >> >> bite them. >> >> >> >> >> >> Given it wasn't applied to 3.4 branch yet I'd suggest revert, fix the >> >> >> patch, then reapply to both branches. (that's easiest/bulletproof >> imo) >> >> >> >> >> >> I'd recommend a new ticket to fix the issues you identified >> (refactor, >> >> >> etc...) Neha was looking for further items to work on, this would be >> a >> >> >> good one. >> >> >> >> >> >> Patrick >> >> >> >> >> >> On Wed, Dec 28, 2011 at 11:18 AM, Camille Fournier < >> [EMAIL PROTECTED]> >> >> >> wrote: >> >> >> > Also, Pat, the JIRA is still open. I will look to revert in a >> minute >> >> >> > although I think the sync changes may necessitate a new JIRA >> entirely. >> >> >> > >> >> >> > On Wed, Dec 28, 2011 at 1:31 PM, Camille Fournier < >> [EMAIL PROTECTED] >> >> > >> >> >> wrote: >> >> >> >> After looking for a few mins, here are my observations: >> >> >> >> >> >> >> >> The implementation of HashSet in jdk 1.6 uses an underlying >> HashMap, >> >> >> which >> >> >> >> uses a bog-standard int for the size. So, I completely agree that >> we >> >> can >> >> >> >> get invalid results for a point in time. But does anyone really >> care >> >> >> about >> >> >> >> the *exact* moment-in-time number of cnxns in the system, in a >> system >> >> >> where >> >> >> >> cnxns are coming and going? At best you'll see a value that will >> >> likely >> >> >> be >> >> >> >> out of date the moment you read it. >> >> >> >> >> >> >> >> But, if we're going to do this, I do think there may be a case for >> >> >> >> refactoring to a concurrent data structure given that we are >> randomly >> >> >> >> syncing on the cnxns set in a way that makes adding additional >> >> >> interactions >> >> >> >> with it error prone in this way. And we should definitely remove +
Patrick Hunt 2011-12-28, 22:12
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Camille Fournier 2011-12-28, 22:21
:)
That is a very good question. This should definitely be fixed in 3.5. I think your general idea in that tracker comment is right on: One option that would make this a lot less ugly would be to deprecate > support for 4letterwords on the client port in 3.5.0 (dep, not remove, > although provide an option to turn off via config), we'd designate a new > port specifically for 4letterwords. We'd fix this problem on the new port, > the old port would have the existing limitations. > > 3.6.0 we would remove 4lw on client port entirely. > > This is good for a number of reasons IMO – one is that having 4lw on the > client port is a bit of a security issue in some customers - as any client > has access to this port and it cannot by definition be firewalled. Many > admins would like to firewall this. > > In the process we could: > 1) make the new port fully b/w compatible with the existing functionality > 2) enable long lived sessions rather than polling > 3) provide support for "extended command syntax" which would enhance 4lw > features (for example to control the format of the output) > 4) etc... > Especially the points 1-4... we need this to be a handshake model, not just a connect and dump model. I want to work on this, but I will probably need someone a bit better at python to rewrite your monitoring server to use it. I would also welcome help on this issue, if there's someone else is interested in tackling it, I am happy to provide feedback and guidance. Should I start clean and make a new tracker or fold it into 1197? C On Wed, Dec 28, 2011 at 5:12 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote: > On Wed, Dec 28, 2011 at 1:53 PM, Camille Fournier <[EMAIL PROTECTED]> > wrote: > > Fair enough. Syncs for all! It's a drop in the bucket compared to the > fact > > that we aren't handling the fundamental socket connections for 4lws > > correctly anyway. > > sorry. I see a blog post coming on. ;-) > > You're referring to the following (any more?) > https://issues.apache.org/jira/browse/ZOOKEEPER-1197 > https://issues.apache.org/jira/browse/ZOOKEEPER-1237 > > is this comment still a correct interpretation of the issue? > > https://issues.apache.org/jira/browse/ZOOKEEPER-737?focusedCommentId=13109067&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13109067 > > What would you like to do about these? I had proposed having a > separate 4lw port some time back in order to simplify/isolate. Should > we do this for 3.5.0? (deprecate the old port usage but keep it around > for a while) Something else? > > Patrick > > > On Wed, Dec 28, 2011 at 4:32 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote: > > > >> On Wed, Dec 28, 2011 at 1:26 PM, Camille Fournier <[EMAIL PROTECTED]> > >> wrote: > >> > I'm not sure I agree in the assumption that monitoring pulls happen > >> > infrequently... > >> > >> Well that would be bad news then - see "stat" command processing and > >> similar. All the more reason to look at this in more depth. > >> > >> Patrick > >> > >> > >> > On Dec 28, 2011 4:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >> > > >> >> They seem like distinct changes to me. In particular getting the size > >> >> is going to happen infrequently (monitoring pull) so I don't see a > >> >> problem fixing the existing patch in the same way the code currently > >> >> handles cnxns access, with a separate jira to do the refactoring. Am > I > >> >> missing something? > >> >> > >> >> Patrick > >> >> > >> >> On Wed, Dec 28, 2011 at 1:10 PM, Camille Fournier < > [EMAIL PROTECTED]> > >> >> wrote: > >> >> > I don't think creating it as a flat sync is a good idea, so if we > want > >> >> this > >> >> > I think we need to refactor that structure to be concurrent. > >> >> > > >> >> > C > >> >> > On Dec 28, 2011 2:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: > >> >> > > >> >> >> I think it needs to be fixed. It's obviously incorrect. Also if > >> >> >> someone changes the underlying implementation at some point it > might > >> > +
Camille Fournier 2011-12-28, 22:21
-
Re: svn commit: r1225200 - in /zookeeper/trunk: ./ src/java/main/org/apache/zookeeper/server/ src/java/test/org/apache/zookeeper/server/quorum/ src/java/test/org/apache/zookeeper/test/Patrick Hunt 2011-12-28, 22:45
I would suggest start clean, "link" the new jira to the old jiras, but
make the new description a useful summary of what this umbrella jira should do and the issues faced. I say umbrella because we should split it into sub-parts, as the whole will be relatively big chunk of work. If you could also add a sub-task right off the bat to address the cnxns issue I think that would be helpful as well. Patrick On Wed, Dec 28, 2011 at 2:21 PM, Camille Fournier <[EMAIL PROTECTED]> wrote: > :) > > That is a very good question. This should definitely be fixed in 3.5. I > think your general idea in that tracker comment is right on: > > One option that would make this a lot less ugly would be to deprecate >> support for 4letterwords on the client port in 3.5.0 (dep, not remove, >> although provide an option to turn off via config), we'd designate a new >> port specifically for 4letterwords. We'd fix this problem on the new port, >> the old port would have the existing limitations. >> >> 3.6.0 we would remove 4lw on client port entirely. >> >> This is good for a number of reasons IMO – one is that having 4lw on the >> client port is a bit of a security issue in some customers - as any client >> has access to this port and it cannot by definition be firewalled. Many >> admins would like to firewall this. >> >> In the process we could: >> 1) make the new port fully b/w compatible with the existing functionality >> 2) enable long lived sessions rather than polling >> 3) provide support for "extended command syntax" which would enhance 4lw >> features (for example to control the format of the output) >> 4) etc... >> > > Especially the points 1-4... we need this to be a handshake model, not just > a connect and dump model. > I want to work on this, but I will probably need someone a bit better at > python to rewrite your monitoring server to use it. > I would also welcome help on this issue, if there's someone else is > interested in tackling it, I am happy to provide feedback and guidance. > > Should I start clean and make a new tracker or fold it into 1197? > > C > > On Wed, Dec 28, 2011 at 5:12 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote: > >> On Wed, Dec 28, 2011 at 1:53 PM, Camille Fournier <[EMAIL PROTECTED]> >> wrote: >> > Fair enough. Syncs for all! It's a drop in the bucket compared to the >> fact >> > that we aren't handling the fundamental socket connections for 4lws >> > correctly anyway. >> >> sorry. I see a blog post coming on. ;-) >> >> You're referring to the following (any more?) >> https://issues.apache.org/jira/browse/ZOOKEEPER-1197 >> https://issues.apache.org/jira/browse/ZOOKEEPER-1237 >> >> is this comment still a correct interpretation of the issue? >> >> https://issues.apache.org/jira/browse/ZOOKEEPER-737?focusedCommentId=13109067&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13109067 >> >> What would you like to do about these? I had proposed having a >> separate 4lw port some time back in order to simplify/isolate. Should >> we do this for 3.5.0? (deprecate the old port usage but keep it around >> for a while) Something else? >> >> Patrick >> >> > On Wed, Dec 28, 2011 at 4:32 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote: >> > >> >> On Wed, Dec 28, 2011 at 1:26 PM, Camille Fournier <[EMAIL PROTECTED]> >> >> wrote: >> >> > I'm not sure I agree in the assumption that monitoring pulls happen >> >> > infrequently... >> >> >> >> Well that would be bad news then - see "stat" command processing and >> >> similar. All the more reason to look at this in more depth. >> >> >> >> Patrick >> >> >> >> >> >> > On Dec 28, 2011 4:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote: >> >> > >> >> >> They seem like distinct changes to me. In particular getting the size >> >> >> is going to happen infrequently (monitoring pull) so I don't see a >> >> >> problem fixing the existing patch in the same way the code currently >> >> >> handles cnxns access, with a separate jira to do the refactoring. Am >> I >> >> >> missing something? +
Patrick Hunt 2011-12-28, 22:45
|