|
|
Edward Ribeiro 2013-02-13, 20:40
Hi folks,
I was reading DataTree code last night, and I would like to ask some snippets I didn't understand, so, please, bear with me, and excuse me in advance if I say something stupid. I am still finding my way through the code base.
1. createNode()
480 HashSet<String> list = ephemerals.get(ephemeralOwner); 481 if (list =null) { 482 list = new HashSet<String>(); 483ephemerals.put(ephemeralOwner, list); 484 } 485 synchronized (list) { 486 list.add(path); 487 }
The snippet above includes the path in the list of ephemeral nodes, and creates an entry for that specific ephemeralOwner (lines 481-484), if it doesn't exists. Suppose that "ephemerals" is initially empty for a given ephemeralOwner, and you have two threads arriving, at roughly the same time, at line 481. Both will modify the same entry (ephemeralOwner), that is empty, so they will create a new List at line 482.
So far, no problem, because "ephemerals" is a ConcurrentMap so the access will be thread fase, and the last write will win (overriding the previous one). The first problem is that as each thread created its own "list" then each one will synchronize on a different "list" reference at line 485 (defeating the purpose of synchronized for this specific case).
Even here there's no problem because each thread is modifying its own copy of "list", but as one of the was overwritten at "ephemerals" then one of the paths will be eventually be lost (because its "list" was overwritten and not more present in "ephemerals"). I don't know if this specific situation is really feasible or once in a million chance -- I suppose it can happen because if the code is synchronizing at line 485 then it says it's possible that two threads alter the same Set<String> at the same time. Again, excuse me and ignore this message, if I said something really stupid. I am still finding my way through the code base. :)
The solution is I came with is very simple:
480 HashSet<String> list = ephemerals.get(ephemeralOwner); 481 if (list =null) { 482 list = ephemerals.putIfAbsent(ephemeralOwner, new HashSet<String>()); 483 484 } 485 synchronized (list) { 486 list.add(path); 487 }
The putIfAbsent() method will work as follow (quoted from javadoc):
If the specified key is not already associated with a value, associate it with the given value. This is equivalent to
if (!map.containsKey(key)) return map.put(key, value); else return map.get(key);
Now you will always have the two (or more) threads holding references to the same "list" object, line 485 works nicely, so that they don't loose any information to be included
2. The code below leaks the internal keySet of "ephemerals" to the outside world (I looked at ConcurrentHashMap impl and, to my surprise it's not a copy as I've always thought, it's the internal Map's keySett !!! ), and this is potentially dangerous.
199 public Collection<Long> getSessions() { 200 return ephemerals.keySet(); 201 }
I suggest to replace by this
199 public Collection<Long> getSessions() { 200 return Collections.unmodifiableSet<Long>(ephemerals.keySet()); 201 }
or this:
199 public Collection<Long> getSessions() { 200 return Collections.unmodifiableSet<Long>(new HashSet<Long>(ephemerals.keySet())); 201 }
3. Line 408-409 state the following (but they are more lines like this in file):
408 * the session id that owns this node. -1 indicates this is not 409 * an ephemeral node.
But looking at the souce code you see:
479 if (ephemeralOwner != 0) {
1198 if (eowner != 0)
779 createTxn.getEphemeral() ? header.getClientId() : 0,
Ooops, it's not -1, but 0 (zero) that determines a non-ephemeral node. Am I right?
Regards, Edward
Thawan Kooburat 2013-02-14, 05:27
In our branch, where we did a lot conversion from synchronize to atomic variables or concurrent data structure. Our DataTree.createNode() calls this method to update ephemeral map
private void addEphemeralPath(String path, long ephemeralOwner) { if (ephemeralOwner != 0) { Set<String> list = ephemerals.get(ephemeralOwner); if (list == null) { Set<String> newList = Collections.newSetFromMap( new ConcurrentHashMap<String, Boolean>(4, 0.75f, 2)); list = ephemerals.putIfAbsent(ephemeralOwner, newList); if (list == null) { list = newList; } } list.add(path); } }
However, race-condition respect to write-request is not possible at the moment due to ZOOKEEPER-1505. The new CommitProcessor only allow 1 write request at any given time. So there can be only one thread that can modify the DataTree (unless we remove that constraint in the future) -- Thawan Kooburat
Edward Ribeiro 2013-02-14, 15:01
Cool.
I was thinking about something along the lines of Collections.newSetFromMap(new ConcurrentHashMap<...>()) too, so that it'd possible to get rid of synchronized like you did. Another point would be the use of Guava's Multimaps, but then it would loose the benefits of ConcurrentMap.
Thanks for the explanation, Thawan!
Edward
On Thu, Feb 14, 2013 at 3:27 AM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:
> In our branch, where we did a lot conversion from synchronize to atomic > variables or concurrent data structure. Our DataTree.createNode() calls > this method to update ephemeral map > > private void addEphemeralPath(String path, long ephemeralOwner) { > if (ephemeralOwner != 0) { > Set<String> list = ephemerals.get(ephemeralOwner); > if (list == null) { > Set<String> newList = Collections.newSetFromMap( > new ConcurrentHashMap<String, Boolean>(4, 0.75f, 2)); > list = ephemerals.putIfAbsent(ephemeralOwner, newList); > if (list == null) { > list = newList; > } > } > list.add(path); > } > } > > However, race-condition respect to write-request is not possible at the > moment due to ZOOKEEPER-1505. The new CommitProcessor only allow 1 write > request at any given time. So there can be only one thread that can modify > the DataTree (unless we remove that constraint in the future) > > > > -- > Thawan Kooburat > > > > > >
|
|