Accumulo >> mail # user >> Suggestions on modeling a composite row key


Re: Suggestions on modeling a composite row key
Excellent, thanks everyone for all the suggestions!

Mike
On Wed, Feb 27, 2013 at 9:50 AM, Keith Turner <[EMAIL PROTECTED]> wrote:

> On Wed, Feb 27, 2013 at 3:03 AM, Christopher <[EMAIL PROTECTED]> wrote:
> > Check out Typo: https://github.com/keith-turner/typo
> > What you're describing is the motivation for that little utility API.
>
> Also, you do not have to use the Typo API.  You could use just the
> Lexicoders that you need in order to encode things so that they sort
> properly lexicographically.
>
>
> https://github.com/keith-turner/typo/blob/master/src/main/java/org/apache/accumulo/typo/encoders/Lexicoder.java
>
> https://github.com/keith-turner/typo/blob/master/src/main/java/org/apache/accumulo/typo/encoders/PairLexicoder.java
>
>
> >
> > Alternatively, if you don't care about the overhead costs or human
> > readability, you could use a modified base64 encoding of your binary
> > key components that preserves the ordering (such as
> > http://iharder.sourceforge.net/current/java/base64/ which I found with
> > Google just now), encode them individually, and join them using a
> > delimiter of your choosing (so long as your delimiter is
> > lexicographically ordered prior to all the bytes in the output bytes
> > of your order-preserving encoding).
> >
> > --
> > Christopher L Tubbs II
> > http://gravatar.com/ctubbsii
> >
> >
> > On Tue, Feb 26, 2013 at 8:51 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:
> >> I need to build up a row key that consists of two parts, the first being a
> >> URL (e.g. http://foo.com/dir/page%20name.htm) and the second being a number
> >> (e.g. "12").
> >>
> >> To date we've been using \u0000 to delimit these two pieces of the key, but
> >> that has some headaches associated with it.
> >>
> >> I'm curious to know how other people have delimited composite row keys. Any
> >> best practices or suggestions?
> >>
> >> Thanks,
> >>
> >> Mike
>
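
For anyone who finds this thread later, here is a minimal sketch of the idea discussed above: encode each part of the key so that it sorts correctly as raw bytes, then join the parts with a delimiter that sorts before every byte the encoding can produce. This is not the Typo or Lexicoder API. The class name CompositeKey, its methods, and the escaping scheme (0x00 becomes 0x01 0x01, 0x01 becomes 0x01 0x02) are assumptions made for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Accumulo or Typo.
public class CompositeKey {

    // Escape 0x00 and 0x01 so the first component never contains the
    // 0x00 delimiter, while keeping unsigned byte order intact.
    static byte[] escape(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : in) {
            if (b == 0x00) { out.write(0x01); out.write(0x01); }
            else if (b == 0x01) { out.write(0x01); out.write(0x02); }
            else { out.write(b); }
        }
        return out.toByteArray();
    }

    // Fixed-width big-endian int with the sign bit flipped, so that
    // negative values sort before positive ones as unsigned bytes.
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Row key layout: escaped URL bytes, a 0x00 delimiter, 4-byte number.
    static byte[] encode(String url, int number) {
        byte[] first = escape(url.getBytes(StandardCharsets.UTF_8));
        byte[] second = encodeInt(number);
        ByteBuffer buf = ByteBuffer.allocate(first.length + 1 + second.length);
        buf.put(first).put((byte) 0x00).put(second);
        return buf.array();
    }

    // Unsigned lexicographic comparison, which is how row keys are sorted.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }
}
```

With this layout, a call such as CompositeKey.encode("http://foo.com/dir/page%20name.htm", 12) yields a key that sorts first by URL and then numerically by the second part, so 2 sorts before 12, which would not be the case if the number were stored as the string "12". The trade-off is that the key is no longer human readable.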