Accumulo >> mail # user >> Suggestions on modeling a composite row key


Re: Suggestions on modeling a composite row key
Excellent, thanks everyone for all the suggestions!

Mike
On Wed, Feb 27, 2013 at 9:50 AM, Keith Turner <[EMAIL PROTECTED]> wrote:

> On Wed, Feb 27, 2013 at 3:03 AM, Christopher <[EMAIL PROTECTED]> wrote:
> > Check out Typo: https://github.com/keith-turner/typo
> > What you're describing is the motivation for that little utility API.
>
> Also, you do not have to use the Typo API.  You could use the
> Lexicoders that you need in order to encode things so that they sort
> properly lexicographically.
>
>
> https://github.com/keith-turner/typo/blob/master/src/main/java/org/apache/accumulo/typo/encoders/Lexicoder.java
>
> https://github.com/keith-turner/typo/blob/master/src/main/java/org/apache/accumulo/typo/encoders/PairLexicoder.java
>
>
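
To illustrate the idea behind order-preserving encoders without pulling in
the Typo library, here is a minimal Java sketch. It is not the Typo or
Lexicoder API: the class name CompositeRowKey and every method on it are
hypothetical, and the escaping scheme (0x00 becomes 0x01 0x01, 0x01 becomes
0x01 0x02) is only one possible choice, assumed here for illustration. The
number is assumed to fit in a long.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Accumulo or Typo.
final class CompositeRowKey {

    // Escape 0x00 and 0x01 so the encoded URL never contains 0x00,
    // which leaves 0x00 free to act as an unambiguous separator.
    static byte[] escape(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length + 4);
        for (byte b : in) {
            if (b == 0x00) { out.write(0x01); out.write(0x01); }
            else if (b == 0x01) { out.write(0x01); out.write(0x02); }
            else { out.write(b); }
        }
        return out.toByteArray();
    }

    // Fixed-width big-endian long with the sign bit flipped, so that
    // unsigned byte-wise comparison matches numeric order.
    static byte[] encodeLong(long v) {
        long flipped = v ^ Long.MIN_VALUE;
        byte[] out = new byte[8];
        for (int i = 7; i >= 0; i--) { out[i] = (byte) flipped; flipped >>>= 8; }
        return out;
    }

    // Row key layout: escaped URL bytes, one 0x00 separator, 8 number bytes.
    static byte[] encode(String url, long number) {
        byte[] first = escape(url.getBytes(StandardCharsets.UTF_8));
        byte[] second = encodeLong(number);
        byte[] out = new byte[first.length + 1 + second.length];
        System.arraycopy(first, 0, out, 0, first.length);
        out[first.length] = 0x00;
        System.arraycopy(second, 0, out, first.length + 1, second.length);
        return out;
    }

    // The first 0x00 is always the separator, because escape() removed
    // every 0x00 from the URL part.
    static int separatorIndex(byte[] key) {
        for (int i = 0; i < key.length; i++) {
            if (key[i] == 0x00) return i;
        }
        throw new IllegalArgumentException("no separator found");
    }

    static String decodeUrl(byte[] key) {
        int sep = separatorIndex(key);
        ByteArrayOutputStream out = new ByteArrayOutputStream(sep);
        for (int i = 0; i < sep; i++) {
            if (key[i] == 0x01) {
                i++;                   // skip the escape marker
                out.write(key[i] - 1); // 0x01 -> 0x00, 0x02 -> 0x01
            } else {
                out.write(key[i]);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    static long decodeNumber(byte[] key) {
        int start = separatorIndex(key) + 1;
        long v = 0;
        for (int i = start; i < start + 8; i++) v = (v << 8) | (key[i] & 0xffL);
        return v ^ Long.MIN_VALUE;
    }

    // Unsigned lexicographic comparison of raw bytes.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }
}
```

With this layout, a key for ("http://foo.com/a", 2) compares lower than the
key for ("http://foo.com/a", 12), whereas the plain strings "2" and "12"
would sort the other way around. A URL that happens to contain \u0000 also
round-trips, since the separator byte can no longer appear inside the first
component.
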
> >
> > Alternatively, if you don't care about the overhead costs or human
> > readability, you could use a modified base64 encoding of your binary
> > key components that preserves the ordering (such as
> > http://iharder.sourceforge.net/current/java/base64/ which I found with
> > Google just now), encode them individually, and join them using a
> > delimiter of your choosing (so long as your delimiter sorts
> > lexicographically before every byte that your order-preserving
> > encoding can output).
> >
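
A rough Java sketch of the encode-then-delimit approach follows. It swaps in
lowercase hex for the ordered base64 variant mentioned above, because hex is
trivially order-preserving and needs no library; the cost is that hex doubles
the size of the encoded bytes. The class name DelimitedHexKey, the '!'
delimiter, and the 10-digit zero padding are all assumptions made for this
example, and the number is assumed to be a non-negative int.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of Accumulo or any library.
final class DelimitedHexKey {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // '!' is 0x21, which sorts before '0' (0x30) and every other
    // character that hex() or the zero-padded number can produce.
    static final char DELIMITER = '!';

    // Two lowercase hex digits per byte; string order matches
    // unsigned byte order of the input.
    static String hex(byte[] in) {
        StringBuilder sb = new StringBuilder(in.length * 2);
        for (byte b : in) {
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    static String encode(String url, int number) {
        if (number < 0) {
            throw new IllegalArgumentException("number must be non-negative");
        }
        // Zero padding to a fixed width makes decimal strings sort numerically.
        String padded = String.format(Locale.ROOT, "%010d", number);
        return hex(url.getBytes(StandardCharsets.UTF_8)) + DELIMITER + padded;
    }
}
```

Because the delimiter sorts below every character in the encoded components,
a URL that is a prefix of another URL still sorts first, regardless of the
number that follows it.
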
> > --
> > Christopher L Tubbs II
> > http://gravatar.com/ctubbsii
> >
> >
> > On Tue, Feb 26, 2013 at 8:51 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:
> >> I need to build up a row key that consists of two parts, the first being a
> >> URL (e.g. http://foo.com/dir/page%20name.htm) and the second being a number
> >> (e.g. "12").
> >>
> >> To date we've been using \u0000 to delimit these two pieces of the key, but
> >> that has some headaches associated with it.
> >>
> >> I'm curious to know how other people have delimited composite row keys. Any
> >> best practices or suggestions?
> >>
> >> Thanks,
> >>
> >> Mike
>