Re: need for more conditional write support
I expect any of these proposals will be relatively simple to implement. My
understanding is that ZK serialises *all* accesses to the data tree, so
there's no need to worry about acquiring individual locks for each znode.
This should be a cautionary note on performance, however. There is no
parallelism in the execution of updates (although there is plenty in the
serialisation process), so we should build a mechanism to constrain how much
work this operation can perform; otherwise there's a danger of hurting
throughput for all clients of a cluster. ZK could do with namespaces, or a
cleverer locking mechanism like traditional databases use, to mitigate this
issue, but that's a much larger undertaking.
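
One way to picture a mechanism that constrains the work of a single
multi-update is a size check applied before the request reaches the serialised
update path. The following is only a rough sketch: the class name
MultiOpLimits, the allows method and both limits are hypothetical and are not
part of ZooKeeper or of any proposal in this thread.

```java
import java.util.List;

/** Hypothetical guard; the thread does not propose concrete limits or names. */
class MultiOpLimits {
    private final int maxOps;     // assumed cap on version checks plus writes
    private final long maxBytes;  // assumed cap on total payload size

    MultiOpLimits(int maxOps, long maxBytes) {
        this.maxOps = maxOps;
        this.maxBytes = maxBytes;
    }

    /**
     * Returns true if a request with the given number of version checks and
     * the given write payloads is small enough to be accepted.
     */
    boolean allows(int testCount, List<byte[]> writes) {
        // Reject requests that bundle too many operations together.
        if ((long) testCount + writes.size() > maxOps) {
            return false;
        }
        // Reject requests whose combined payload is too large.
        long total = 0;
        for (byte[] w : writes) {
            total += w.length;
            if (total > maxBytes) {
                return false;
            }
        }
        return true;
    }
}
```

A request that fails the check would be refused up front, so it never occupies
the single update path that every client shares.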

Figuring out a good API for the ensemble to implement can be slightly
decoupled from the API that a client application sees. Therefore I prefer
the list parameters API, which can be wrapped in a builder API in Java if
that makes sense (these kinds of API are less natural in C, for example).
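
To make the distinction concrete, here is a minimal sketch of a list-parameter
call with a thin builder layered on top. It runs against a toy in-memory map,
not against ZooKeeper; ToyDataTree, MultiUpdate, multiTestAndSet and commit are
illustrative names I am assuming here, and none of them is an existing API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory stand-in for the data tree; NOT the ZooKeeper API. */
class ToyDataTree {
    private static final class Node {
        byte[] data;
        int version;
        Node(byte[] data) { this.data = data; }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    synchronized void create(String path, byte[] data) {
        nodes.put(path, new Node(data));
    }

    synchronized int version(String path) { return nodes.get(path).version; }

    synchronized byte[] data(String path) { return nodes.get(path).data; }

    /**
     * List-parameter form: check every (node, version) pair, then apply every
     * write. Nothing is written unless all checks pass. The two pairs of lists
     * are independent, so the nodes written need not be the nodes tested.
     */
    synchronized boolean multiTestAndSet(List<String> toTest, List<Integer> versions,
                                         List<String> toSet, List<byte[]> data) {
        if (toTest.size() != versions.size() || toSet.size() != data.size()) {
            throw new IllegalArgumentException("parallel lists must have equal length");
        }
        for (int i = 0; i < toTest.size(); i++) {
            Node n = nodes.get(toTest.get(i));
            if (n == null || n.version != versions.get(i)) {
                return false; // missing node or stale version: change nothing
            }
        }
        for (String path : toSet) {
            if (!nodes.containsKey(path)) {
                return false; // refuse to write to a node that does not exist
            }
        }
        for (int i = 0; i < toSet.size(); i++) {
            Node n = nodes.get(toSet.get(i));
            n.data = data.get(i);
            n.version++;
        }
        return true;
    }
}

/** Builder-style wrapper that only collects arguments for the list call. */
class MultiUpdate {
    private final ToyDataTree tree;
    private final List<String> toTest = new ArrayList<>();
    private final List<Integer> versions = new ArrayList<>();
    private final List<String> toSet = new ArrayList<>();
    private final List<byte[]> data = new ArrayList<>();

    MultiUpdate(ToyDataTree tree) { this.tree = tree; }

    MultiUpdate testVersion(String path, int version) {
        toTest.add(path);
        versions.add(version);
        return this;
    }

    MultiUpdate setData(String path, byte[] bytes) {
        toSet.add(path);
        data.add(bytes);
        return this;
    }

    boolean commit() {
        return tree.multiTestAndSet(toTest, versions, toSet, data);
    }
}
```

A caller could then write
new MultiUpdate(tree).testVersion("/a", 0).setData("/b", bytes).commit(),
while a C binding could expose only the list form.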

cheers,
Henry

On 16 December 2010 10:39, Jared Cantwell <[EMAIL PROTECTED]> wrote:

> I think that syntactic sugar can be very limiting.  What if you have X
> children you would like to update, but don't know X until runtime?  I like
> the idea of lists that don't have to be subsets of each other, giving more
> flexibility.  I also think it would be interesting to discuss what
> additional recipes could be developed with this api.
>
> ~Jared
>
> On Thu, Dec 16, 2010 at 1:06 PM, Dave Wright <[EMAIL PROTECTED]> wrote:
>
> > I'm not sure why (other than your syntax) you would require the second
> > list (to update) to be a subset of the first (to test). There are
> > plenty of situations where you may want to update one node based on
> > the value of another (and test that the value hasn't changed before
> > updating) but don't really care about the second node, and it would
> > just be extra overhead to check its current value. In fact, I think
> > that was the OP's situation.
> >
> > -Dave
> >
> > On Thu, Dec 16, 2010 at 1:01 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> > > Yes.  This is isomorphic to my suggestion to allow null data.  We should
> > > toss around many options to figure out which is the most congenial idiom.
> > > Yours is nice since it has two sets of parallel lists.
> > >
> > > In Java it would be possible to use a builder style with optional
> > > arguments:
> > >
> > >               zk.testVersions(node1, version1, node2, version2, ...)
> > >                       .updateData(node1, data1, node3, data3, ...)
> > >
> > > I would tend to make it part of the contract that the nodes in the second
> > > part be a subset of the nodes in the first part.  The first method would
> > > create an object packaging up the first set of args and the second method
> > > would do the work.  Of course, this is just syntactic sugar for the more
> > > list oriented version.
> > >
> > > On Thu, Dec 16, 2010 at 8:16 AM, Dave Wright <[EMAIL PROTECTED]> wrote:
> > >
> > >> My recommendation would actually be a combination of the two which
> > >> offers the most flexibility:
> > >>
> > >> zoo_multi_test_and_set(List<string> znodesToTest, List<int> versions,
> > >> List<string> znodesToSet, List<byte[]> data)
> > >>
> > >> ...this specifies a list of nodes & versions to check, and if the
> > >> versions match, a list of nodes to set and the associated data.
> > >> This allows multiple scenarios, including setting nodes other than the
> > >> ones you are version checking, setting more nodes than you version
> > >> check, checking more nodes than you set, etc.
> > >> I don't think the implementation would be any harder than either of the
> > >> others.
> > >>
> > >> -Dave
> > >>
> > >>
> > >> On Wed, Dec 15, 2010 at 10:50 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> > >> > Well, I would just call the first method set.
> > >> >
> > >> > And I think that the second method is no easier to implement and

Henry Robinson
Software Engineer
Cloudera
415-994-6679