Zookeeper >> mail # user >> Sanitizing ZooKeeper znode names


David Nickerson 2012-07-06, 16:10
Re: Sanitizing ZooKeeper znode names
I like to use URL encoding. Then I can use the JDK's URLEncoder.
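
A minimal sketch of the URL-encoding idea, for illustration only. The class name ZnodeNames and its methods are hypothetical, not part of ZooKeeper or any library. It assumes UTF-8, and it assumes that the names ".", ".." and "zookeeper" need special handling because URLEncoder leaves them unchanged.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, not part of ZooKeeper or the JDK.
final class ZnodeNames {
    private ZnodeNames() {}

    // Turns an arbitrary resource name into a single path component.
    static String encode(String resourceName) {
        if (resourceName.isEmpty()) {
            throw new IllegalArgumentException("resource name must not be empty");
        }
        String encoded;
        try {
            // '/' becomes "%2F", a space becomes "+", non-ASCII becomes %XX bytes.
            encoded = URLEncoder.encode(resourceName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is required on every JVM", e);
        }
        // URLEncoder never escapes '.' or letters, so these names pass through
        // unchanged. Escape them by hand (assumption: they are not usable as-is).
        if (encoded.equals(".")) return "%2E";
        if (encoded.equals("..")) return "%2E%2E";
        if (encoded.equals("zookeeper")) return "%7Aookeeper";
        return encoded;
    }

    // Reverses encode(), including the hand-escaped special cases.
    static String decode(String znodeName) {
        try {
            return URLDecoder.decode(znodeName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is required on every JVM", e);
        }
    }

    public static void main(String[] args) {
        String name = encode("/var/lock/my file");
        System.out.println(name);          // prints %2Fvar%2Flock%2Fmy+file
        System.out.println(decode(name));  // prints /var/lock/my file
    }
}
```

Names that are already plain alphanumerics stay readable in ZooKeeper, and the encoding can be reversed with URLDecoder.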

===================
Jordan Zimmerman

On Jul 6, 2012, at 9:11 AM, David Nickerson
<[EMAIL PROTECTED]> wrote:

> I'm writing a distributed locking API based on ZooKeeper. I create nodes
> based on the resource names, but I have no control over what the client
> chooses as their resource name. (Quite often the client uses Linux file
> paths, so I have to remove or escape all of the forward slashes.)
>
> To clean the node names, I wrote a method that escapes the bad characters.
> The method is called 'normalize': http://pastebin.com/hakkb9Nw .
>
> For example, a forward slash becomes \x2f. This method works, but it has a
> few drawbacks. It doesn't deal with Unicode characters greater than 16 bits
> in size, and it's impossible to reverse the escape process. Also,
> crucially, it is possible that two different resources will result in the
> same znode name, which could cause all kinds of trouble.
>
> A more reliable approach would be to convert the resource name into hex.
> For example:
>
> import java.nio.charset.StandardCharsets;
> import javax.xml.bind.DatatypeConverter;
>
> DatatypeConverter.printHexBinary(string.getBytes(StandardCharsets.UTF_8))
>
> This would always result in a safe and unique node name. (It will never
> result in the token "zookeeper" because "zookeeper" has an odd number of
> characters.) The only problem with this is that it becomes impossible to
> read and understand the resource names from ZooKeeper unless you reverse
> the process:
>
> new String(DatatypeConverter.parseHexBinary(hex), StandardCharsets.UTF_8)
>
> So I'm wondering, is there a standard or recommended practice for
> sanitizing znode names? If not, which approach would you recommend?
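
For comparison, the hex approach described in the quoted message can be sketched without javax.xml.bind, which newer JDKs no longer bundle. The class name HexZnodeNames and its methods are hypothetical. The sketch assumes UTF-8 so that every JVM produces the same name for the same resource, including characters outside the 16-bit range.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ZooKeeper or the JDK.
final class HexZnodeNames {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private HexZnodeNames() {}

    // Encodes the UTF-8 bytes of the resource name as uppercase hex.
    static String toHex(String resourceName) {
        byte[] bytes = resourceName.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }

    // Reverses toHex(); rejects input that is not an even-length hex string.
    static String fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit at byte " + i);
            }
            bytes[i] = (byte) ((hi << 4) | lo);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(toHex("/tmp"));        // prints 2F746D70
        System.out.println(fromHex("2F746D70"));  // prints /tmp
    }
}
```

The trade-off is the one already noted in the message: hex names are unique and reversible but unreadable in ZooKeeper, and they are twice the length of the UTF-8 input. An empty resource name still yields an empty string, so the caller would have to reject that case separately.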