Zookeeper >> mail # dev >> Mounting a remote Zookeeper


RE: Mounting a remote Zookeeper
This is a preliminary proposal, so everything is still open. Still, I think it has many advantages over the previous namespace partitioning proposal (http://wiki.apache.org/hadoop/ZooKeeper/PartitionedZookeeper), which AFAIK was never implemented. The idea here is to make much smaller and more intuitive changes.
For example, the previous proposal did not offer any ordering guarantees across partitions. Also, with a Linux mount you don't need to specify, for each new file, which mount point the file belongs to. We can likewise exploit the tree structure to infer that, instead of creating and maintaining an additional hierarchy as in the previous proposal.
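
To illustrate what inferring the mount point from the tree structure could look like, here is a minimal Python sketch. It is not taken from the proposal: the mount table, the cluster names, and the function names are all hypothetical, and the longest-prefix rule is simply how a Linux-style mount lookup usually behaves.

```python
# Hypothetical mount table: path prefix -> name of the cluster serving it.
# None of these names come from the proposal; they are for illustration only.
MOUNTS = {
    "/": "local",
    "/apps/remote": "cluster-b",
    "/apps/remote/archive": "cluster-c",
}


def split_path(path):
    """Return the non-empty components of a znode-style path."""
    return [part for part in path.split("/") if part]


def owning_cluster(path, mounts=MOUNTS):
    """Pick the cluster whose mount point is the longest prefix of `path`.

    Matching is done on whole path components, so "/apps/remotely" is not
    treated as living under the "/apps/remote" mount point.
    """
    parts = split_path(path)
    best_len, best_cluster = -1, None
    for mount_point, cluster in mounts.items():
        mount_parts = split_path(mount_point)
        is_prefix = parts[:len(mount_parts)] == mount_parts
        if is_prefix and len(mount_parts) > best_len:
            best_len, best_cluster = len(mount_parts), cluster
    return best_cluster
```

With this table, a client creating `/apps/remote/x` never names a partition; the path alone determines that `cluster-b` serves it.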

> what happens when a client does a read on the remote ZK cluster. does the read always get
> forwarded to the remote cluster?

No. The idea is to identify when inter-cluster communication is necessary to maintain sequential consistency, and to avoid it otherwise. In the twiki we propose one possible rule. For example, if you read from a remote partition that didn't mount any part of your local namespace, it's OK to return an old value. In any case, the read is never forwarded to the remote cluster: even if inter-cluster communication is necessary, we sync the observer with the remote leader and then read from the observer.
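
A toy Python model of this read path, as a sketch only. The class and function names are hypothetical, and the condition used here (sync whenever the remote cluster has mounted part of the local namespace) is a simplification of the example above, not necessarily the exact rule in the twiki.

```python
class Leader:
    """Stand-in for the remote cluster's leader (hypothetical model)."""

    def __init__(self):
        self.data = {}

    def write(self, path, value):
        self.data[path] = value


class Observer:
    """Local replica of the remote cluster; may lag behind its leader."""

    def __init__(self, leader):
        self.leader = leader
        self.data = {}

    def sync(self):
        # Catch up with everything the remote leader has committed so far.
        self.data = dict(self.leader.data)

    def read(self, path):
        return self.data.get(path)


def read_remote(observer, path, remote_mounts_local):
    """Serve a read of a remote path from the local observer.

    The read itself is never forwarded. Assumption for this sketch: a sync
    is needed only if the remote cluster mounted part of the local namespace.
    """
    if remote_mounts_local:
        observer.sync()
    return observer.read(path)
```

In both branches the value comes from the observer; the only inter-cluster traffic is the optional sync.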

> in your proposal, what happens if a client creates an ephemeral
> node on the remote ZK cluster? who does the failure detection and clean up?

You're right, we should definitely address that in the twiki. I think that in any case a cluster should only monitor the clients connected to it, not clients connected to remote clusters. So if we support creating remote ephemeral nodes, I think failure detection should be done locally, and the remote cluster should subscribe to the relevant local failure events and be notified of them.
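
One way this subscription could be arranged, sketched in Python. Nothing here is an existing ZooKeeper API: the classes, the callback, and the session ids are hypothetical, and the sketch ignores how the notification would actually travel between clusters.

```python
class LocalCluster:
    """Cluster the client is connected to; it alone detects client failure."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        # A remote cluster registers interest in local failure events.
        self.subscribers.append(callback)

    def session_expired(self, session_id):
        # Local failure detection fired: notify every subscribed cluster.
        for callback in self.subscribers:
            callback(session_id)


class RemoteCluster:
    """Cluster that hosts the ephemeral nodes but does not monitor clients."""

    def __init__(self):
        self.ephemerals = {}  # path -> id of the session that created it

    def create_ephemeral(self, path, session_id):
        self.ephemerals[path] = session_id

    def on_session_expired(self, session_id):
        # Clean up every ephemeral node owned by the failed session.
        self.ephemerals = {
            path: owner
            for path, owner in self.ephemerals.items()
            if owner != session_id
        }
```

The remote cluster would call `local.subscribe(remote.on_session_expired)` once, after which a local session expiry removes that session's remote ephemeral nodes.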

> what happens if the request to the remote cluster hangs?

The user can decide what happens in this case. If they want all subsequent requests to fail when a remote request fails, then a hung remote request will block everything issued after it. Otherwise, the remote request can fail while the local requests that follow it still succeed.
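
The two behaviours can be compared with a small Python sketch. It is only an illustration: the function name, the flag, and the request tuples are hypothetical, and a hung request is modelled simply as one that ends up failing (for example after a timeout).

```python
def run_requests(requests, block_on_remote_failure):
    """Process a client's requests in order and return name -> status.

    `requests` is a list of (name, is_remote, succeeds) tuples. A hung
    remote request is modelled as succeeds=False. The flag is a
    hypothetical per-user setting, not an existing ZooKeeper option.
    """
    results = {}
    blocked = False
    for name, is_remote, succeeds in requests:
        if blocked:
            # An earlier remote failure poisons everything after it.
            results[name] = "failed"
        elif succeeds:
            results[name] = "ok"
        else:
            results[name] = "failed"
            if is_remote and block_on_remote_failure:
                blocked = True
    return results


# One local write, one remote write that hangs, then another local write.
REQUESTS = [
    ("local-1", False, True),
    ("remote-1", True, False),
    ("local-2", False, True),
]
```

Calling `run_requests(REQUESTS, True)` fails `local-2` along with `remote-1`, while `run_requests(REQUESTS, False)` lets `local-2` succeed.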

Thanks,
Alex

> -----Original Message-----
> From: Benjamin Reed [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, June 09, 2011 4:05 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Mounting a remote Zookeeper
>
> this is a small nit, but i think the partition proposal works a bit
> more like a mount point than your proposal. when you mount a file
> system, the mount isn't transparent. two mounted file systems can have
> files with the same inode number, for example. you also can't do some
> things like a rename across file system boundaries.
>
> in your proposal, what happens if a client creates an ephemeral
> node on the remote ZK cluster. who does the failure detection and
> clean up? it also wasn't clear what happens when a client does a read
> on the remote ZK cluster. does the read always get forwarded to the
> remote cluster? also what happens if the request to the remote cluster
> hangs?
>
> thanx
> ben
>
> On Thu, Jun 9, 2011 at 11:41 AM, Alexander Shraer <shralex@yahoo-inc.com> wrote:
> > Hi,
> >
> > We're considering working on a new feature that will allow "mounting"
> > part of the namespace of one ZK cluster into another ZK cluster. The
> > goal is essentially to be able to partition a ZK namespace while
> > preserving current ZK semantics as much as possible.
> > More details are here:
> > http://wiki.apache.org/hadoop/ZooKeeper/MountRemoteZookeeper
> >
> > It would be great to get your feedback and especially please let us
> > know if you think your application can benefit from this feature.
> >
> > Thanks,
> > Alex Shraer and Eddie Bortnikov
> >
> >
> >