Re: Constructing rowkeys and HBASE-7221
Hi Doug,

This HBase feature is really interesting. It is quite related to some work
we're doing on Kiji, our schema management project. In particular, we've
also been focusing on building composite row keys correctly. One thing that
jumped out at me in that ticket is that with a composition of md5hash and
other (string, int, etc) components, you probably don't want the whole
hash. If you're using that to shard your rows more efficiently across
regions, you might want to just use a subset of the md5 bytes as a prefix.
It might be a good idea to offer users control of this.
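
To make the prefix idea concrete, here is a minimal sketch in plain Java. It is not the HBASE-7221 patch and not Kiji code: the class name SaltedRowKey, the helpers md5Prefix and buildKey, and the choice to hash only the int component are all assumptions made for illustration. It keeps just the first prefixLen bytes of the MD5 digest and puts them in front of the int and long components.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration only; not part of HBase or Kiji.
final class SaltedRowKey {

    // Returns the first prefixLen bytes of the 16-byte MD5 digest of input.
    static byte[] md5Prefix(byte[] input, int prefixLen) {
        if (prefixLen < 1 || prefixLen > 16) {
            throw new IllegalArgumentException("prefixLen must be between 1 and 16");
        }
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(input);
            return Arrays.copyOf(digest, prefixLen);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Key layout: [prefixLen hash bytes][4-byte int][8-byte long], big-endian.
    // Assumption: only the int component is hashed, so all rows that share
    // an int value get the same prefix and stay contiguous.
    static byte[] buildKey(int intVal, long longVal, int prefixLen) {
        byte[] intBytes = ByteBuffer.allocate(Integer.BYTES).putInt(intVal).array();
        ByteBuffer buf = ByteBuffer.allocate(prefixLen + Integer.BYTES + Long.BYTES);
        buf.put(md5Prefix(intBytes, prefixLen));
        buf.put(intBytes);
        buf.putLong(longVal);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] key = buildKey(7, 1358276460000L, 2);
        System.out.println(key.length); // prints 14 (2 + 4 + 8)
    }
}
```

With a 2-byte prefix the key is 14 bytes instead of the 28 a full 16-byte hash would need, and a reader can still recompute the prefix from the int component when doing a point lookup.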

Our own thoughts on this on the Kiji side are being tracked at
https://jira.kiji.org/browse/schema-3 where we have a design doc that goes
into a bit more detail.

- Aaron
On Tue, Jan 15, 2013 at 2:01 PM, Doug Meil <[EMAIL PROTECTED]> wrote:

> Hi there, well, this request for input fell like a thud.  :-)
> But I think perhaps it has to do with the fact that I sent it to the
> dev-list instead of the user-list, as people that are actively writing
> HBase itself (devs) need less help with such keybuilding utilities.
> So one last request for feedback, but this time aimed at users of HBase:
> how has your key-building experience been?
> Thanks!
> On 1/7/13 11:04 AM, "Doug Meil" <[EMAIL PROTECTED]> wrote:
> >
> >Greetings folks-
> >
> >I would like to restart the conversation on
> >https://issues.apache.org/jira/browse/HBASE-7221 because there continue
> >to be conversations on the dist-list about creating composite rowkeys,
> >and while HBase makes just about anything possible, it doesn't make much
> >easy in this respect.
> >
> >What I'm lobbying for is a utility class (see the v3 patch in HBASE-7221)
> >that can both create and read rowkeys (so this isn't just a one-way
> >builder pattern).
> >
> >This is currently stuck because it was noted that Bytes has an issue with
> >sort-order of numbers specifically if you have both negative and positive
> >values, which is really a different issue, but because this patch uses
> >Bytes it's related.
> >
> >What are people's thoughts on this topic in general, and the v3 version
> >of the patch specifically?  (and the last set of comments).  Thanks!
> >
> >One of the unit tests shows the example of usage.  The last set of
> >comments suggested that RowKey be renamed FixedLengthRowKey, which I
> >think is a good idea.  A follow-on patch could include
> >VariableLengthRowKey for folks that use strings in the rowkeys.
> >
> >
> >  public void testCreate() throws Exception {
> >
> >    int elements[] = {RowKeySchema.SIZEOF_MD5_HASH, RowKeySchema.SIZEOF_INT, RowKeySchema.SIZEOF_LONG};
> >    RowKeySchema schema = new RowKeySchema(elements);
> >
> >    RowKey rowkey = schema.createRowKey();
> >    rowkey.setHash(0, hashVal);
> >    rowkey.setInt(1, intVal);
> >    rowkey.setLong(2, longVal);
> >
> >    byte bytes[] = rowkey.getBytes();
> >    Assert.assertEquals("key length", schema.getRowKeyLength(), bytes.length);
> >
> >    Assert.assertEquals("e1", rowkey.getInt(1), intVal);
> >    Assert.assertEquals("e2", rowkey.getLong(2), longVal);
> >  }
> >
> >Doug Meil
> >Chief Software Architect, Explorys
> >