Accumulo >> mail # user >> Suggestions on modeling a composite row key

Re: Suggestions on modeling a composite row key
At sqrrl, we tend to use a Tuple class that implements List<String>
(List<ByteBuffer> would also work), and has conversions to and from
ByteBuffer. To encode the tuple into a byte buffer, change all the "\1"s to
"\1\2", change all the "\0"s to "\1\1", and put a "\0" byte between
elements. "\1" serves as the escape character for every "\1" and
"\0" appearing in the unencoded form. To decode, just split on "\0"
and reverse the escaping. This encoding preserves hierarchical,
lexicographical ordering of tuple elements.
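
For anyone who wants to try this, here is a rough sketch of the escaping described above. It is not the actual Tuple class; TupleCodec and its method names are made up for illustration, and it works on plain byte[] elements rather than ByteBuffer to keep it short.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not the sqrrl Tuple class) sketching the escaping scheme. */
public final class TupleCodec {
    private TupleCodec() {}

    /** Escapes 0x00 as 0x01 0x01 and 0x01 as 0x01 0x02, with a 0x00 byte between elements. */
    public static byte[] encode(List<byte[]> elements) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < elements.size(); i++) {
            if (i > 0) {
                out.write(0x00); // element separator
            }
            for (byte b : elements.get(i)) {
                if (b == 0x00) {
                    out.write(0x01);
                    out.write(0x01);
                } else if (b == 0x01) {
                    out.write(0x01);
                    out.write(0x02);
                } else {
                    out.write(b);
                }
            }
        }
        return out.toByteArray();
    }

    /** Splits on unescaped 0x00 bytes and reverses the escaping. */
    public static List<byte[]> decode(byte[] encoded) {
        List<byte[]> elements = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length; i++) {
            byte b = encoded[i];
            if (b == 0x00) {
                // A raw 0x00 can only be a separator, since data zeros were escaped.
                elements.add(current.toByteArray());
                current.reset();
            } else if (b == 0x01) {
                if (i + 1 >= encoded.length) {
                    throw new IllegalArgumentException("dangling escape byte at end of input");
                }
                byte next = encoded[++i];
                if (next == 0x01) {
                    current.write(0x00);
                } else if (next == 0x02) {
                    current.write(0x01);
                } else {
                    throw new IllegalArgumentException("invalid escape sequence at offset " + (i - 1));
                }
            } else {
                current.write(b);
            }
        }
        elements.add(current.toByteArray());
        return elements;
    }
}
```

For the URL-plus-number case, you would pass the UTF-8 bytes of the URL and of the number as the two elements and use the result as the row. One caveat of this sketch: an empty tuple and a tuple holding a single empty element both encode to zero bytes, so decode always returns at least one element.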


On Tue, Feb 26, 2013 at 11:51 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:

> I need to build up a row key that consists of two parts, the first being a
> URL (e.g. http://foo.com/dir/page%20name.htm) and the second being a
> number (e.g. "12").
> To date we've been using \u0000 to delimit these two pieces of the key,
> but that has some headaches associated with it.
> I'm curious to know how other people have delimited composite row keys.
>  Any best practices or suggestions?
> Thanks,
> Mike