David Nickerson 2012-07-06, 16:10
I like to use URL encoding. Then I can use the JDK's java.net.URLEncoder.
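A minimal sketch of what that could look like, assuming UTF-8 and assuming the names "." and ".." and the token "zookeeper" need special handling (URLEncoder leaves letters and '.' untouched, so it does not take care of those on its own). The class and method names here are hypothetical, not part of any ZooKeeper API:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper: maps arbitrary resource names to znode names and back.
final class UrlZnodeNames {
    private UrlZnodeNames() {}

    static String encode(String resourceName) {
        try {
            // '/' becomes %2F, ' ' becomes '+', non-ASCII becomes %XX UTF-8 bytes.
            String encoded = URLEncoder.encode(resourceName, "UTF-8");
            // Assumption: "." and ".." are not usable as node names, so escape the dots.
            if (encoded.equals(".") || encoded.equals("..")) {
                return encoded.replace(".", "%2E");
            }
            // Assumption: "zookeeper" is reserved, so escape its first letter.
            if (encoded.equals("zookeeper")) {
                return "%7Aookeeper";
            }
            return encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    static String decode(String znodeName) {
        try {
            // URLDecoder also reverses the extra %2E / %7A escapes added above.
            return URLDecoder.decode(znodeName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, UrlZnodeNames.encode("/var/lock/my file") gives "%2Fvar%2Flock%2Fmy+file", which is still readable in a ZooKeeper browser, and decode() returns the original string. An empty resource name would still need to be rejected by the caller.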
On Jul 6, 2012, at 9:11 AM, David Nickerson
<[EMAIL PROTECTED]> wrote:
> I'm writing a distributed locking API based on ZooKeeper. I create nodes
> based on the resource names, but I have no control over what the client
> chooses as their resource name. (Quite often the client uses Linux file
> paths, so I have to remove or escape all of the forward slashes.)
> To clean the node names, I wrote a method that escapes the bad characters.
> The method is called 'normalize': http://pastebin.com/hakkb9Nw .
> For example, a forward slash becomes \x2f. This method works, but it has a
> few drawbacks. It doesn't deal with Unicode characters greater than 16 bits
> in size, and it's impossible to reverse the escape process. Also,
> crucially, it is possible that two different resources will result in the
> same znode name, which could cause all kinds of trouble.
> A more reliable approach would be to convert the resource name into hex.
> For example:
> import javax.xml.bind.DatatypeConverter;
> String hex = DatatypeConverter.printHexBinary(resourceName.getBytes());
> This would always result in a safe and unique node name. (It will never
> result in the token "zookeeper" because "zookeeper" has an odd number of
> characters.) The only problem with this is that it becomes impossible to
> read and understand the resource names from ZooKeeper unless you reverse
> the process:
> new String(DatatypeConverter.parseHexBinary(hex))
> So I'm wondering, is there a standard or recommended practice for
> sanitizing znode names? If not, which approach would you recommend?
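
For comparison, the hex approach from the quoted message can be sketched without javax.xml.bind, which is no longer bundled with the JDK as of Java 11. This is only an illustration: the class and method names are hypothetical, it emits lowercase hex (DatatypeConverter prints uppercase), and it fixes the charset to UTF-8 instead of relying on the platform default:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: hex-encodes resource names so that every znode name
// consists only of the characters 0-9 and a-f.
final class HexZnodeNames {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    private HexZnodeNames() {}

    static String toHex(String resourceName) {
        byte[] bytes = resourceName.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // High nibble first, then low nibble: two hex digits per byte.
            sb.append(DIGITS[(b >> 4) & 0xF]).append(DIGITS[b & 0xF]);
        }
        return sb.toString();
    }

    static String fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string: " + hex);
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            bytes[i] = (byte) ((hi << 4) | lo);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Here HexZnodeNames.toHex("/tmp") gives "2f746d70". The mapping is unique and reversible, at the cost of names that cannot be read directly in ZooKeeper, which is the trade-off against URL encoding.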