HBase, mail # user - Re: Constructing rowkeys and HBASE-7221


Doug Meil 2013-01-15, 22:01
Aaron Kimball 2013-01-16, 19:06
Re: Constructing rowkeys and HBASE-7221
Doug Meil 2013-01-17, 13:30

Thanks Aaron!

I will take a look at Kiji.  And I think it underscores the need for some
type of rowkey building/parsing utility being available in HBase,
because one of the first things folks tend to do when they start using
HBase is build their own keybuilder utility (the same sentiment was also
expressed by others in the HBASE-7221 ticket comments).

It's good that you have full control over the rowkey (i.e., byte[]) as a
backstop, but HBase should also try to make things a bit easier for some
common cases.  I think it will help adoption.

The general idea is a FixedLengthRowKey and a VariableLengthRowKey along
with a RowKeySchema class, and I think that the variant you bring up is a
great idea (e.g., prefix vs. hash).  Let's keep this ball rolling!
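
To make the "prefix vs. hash" variant concrete, here is a minimal sketch of a fixed-length composite key that keeps only the first few MD5 bytes as a prefix, followed by an int and a long. It uses only the JDK and is not the API from the HBASE-7221 patch: the class name PrefixedRowKey, the 4-byte PREFIX_LEN, and the field layout are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of HBase or of the HBASE-7221 patch.
public class PrefixedRowKey {
    // Assumption: 4 of the 16 MD5 bytes are enough to spread rows across regions.
    static final int PREFIX_LEN = 4;
    static final int KEY_LEN = PREFIX_LEN + Integer.BYTES + Long.BYTES;

    // Build: [first PREFIX_LEN bytes of md5(entityId)][int][long], big-endian.
    static byte[] build(String entityId, int intVal, long longVal) {
        byte[] md5 = md5(entityId);
        return ByteBuffer.allocate(KEY_LEN)
                .put(md5, 0, PREFIX_LEN)
                .putInt(intVal)
                .putLong(longVal)
                .array();
    }

    // Parse: because every field has a fixed width, offsets are known up front.
    static int readInt(byte[] key) {
        return ByteBuffer.wrap(key).getInt(PREFIX_LEN);
    }

    static long readLong(byte[] key) {
        return ByteBuffer.wrap(key).getLong(PREFIX_LEN + Integer.BYTES);
    }

    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, PrefixedRowKey.build("user-42", 7, 123456789L) yields a 16-byte key, and readInt/readLong recover 7 and 123456789L from it. Note that the truncated hash cannot be reversed, so the original entity id would have to be stored elsewhere if it is needed on reads.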

On 1/16/13 2:06 PM, "Aaron Kimball" <[EMAIL PROTECTED]> wrote:

>Hi Doug,
>
>This HBase feature is really interesting. It is quite related to some work
>we're doing on Kiji, our schema management project. In particular, we've
>also been focusing on building composite row keys correctly. One thing that
>jumped out at me in that ticket is that with a composition of md5hash and
>other (string, int, etc) components, you probably don't want the whole
>hash. If you're using that to shard your rows more efficiently across
>regions, you might want to just use a subset of the md5 bytes as a prefix.
>It might be a good idea to offer users control of this.
>
>Our own thoughts on this on the Kiji side are being tracked at
>https://jira.kiji.org/browse/schema-3 where we have a design doc that goes
>into a bit more detail.
>
>Cheers,
>- Aaron
>
>
>On Tue, Jan 15, 2013 at 2:01 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
>
>>
>> Hi there, well, this request for input fell like a thud.  :-)
>>
>> But I think perhaps it has to do with the fact that I sent it to the
>> dev-list instead of the user-list, as people that are actively writing
>> HBase itself (devs) need less help with such keybuilding utilities.
>>
>> So one last request for feedback, but this time aimed at users of HBase:
>> how has your key-building experience been?
>>
>> Thanks!
>>
>>
>>
>> On 1/7/13 11:04 AM, "Doug Meil" <[EMAIL PROTECTED]> wrote:
>>
>> >
>> >Greetings folks-
>> >
>> >I would like to restart the conversation on
>> >https://issues.apache.org/jira/browse/HBASE-7221 because there continue
>> >to be conversations on the dist-list about creating composite rowkeys,
>> >and while HBase makes just about anything possible, it doesn't make much
>> >easy in this respect.
>> >
>> >What I'm lobbying for is a utility class (see the v3 patch in HBASE-7221)
>> >that can both create and read rowkeys (so this isn't just a one-way
>> >builder pattern).
>> >
>> >This is currently stuck because it was noted that Bytes has an issue with
>> >sort-order of numbers, specifically if you have both negative and positive
>> >values, which is really a different issue, but because this patch uses
>> >Bytes it's related.
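
The sort-order issue can be reproduced without HBase. The sketch below assumes row keys are compared as unsigned bytes, lexicographically, and that ints are written as big-endian two's complement; under those assumptions negative values sort after positive ones. Flipping the sign bit before encoding is one common workaround. The class SortableInt and its method names are hypothetical and are not taken from the patch or from Bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not an HBase class.
public class SortableInt {
    // Plain big-endian two's complement encoding.
    static byte[] plain(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
    }

    // Flip the sign bit so unsigned byte order matches signed numeric order.
    static byte[] sortable(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Undo the sign-bit flip when reading the value back.
    static int fromSortable(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }
}
```

With the plain encoding, compareBytes(plain(-1), plain(1)) is positive, so -1 sorts after 1. With the sign bit flipped, compareBytes(sortable(-1), sortable(1)) is negative, and fromSortable restores the original value.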
>> >
>> >What are people's thoughts on this topic in general, and the v3 version
>> >of the patch specifically?  (and the last set of comments).  Thanks!
>> >
>> >One of the unit tests shows an example of usage.  The last set of
>> >comments suggested that RowKey be renamed FixedLengthRowKey, which I
>> >think is a good idea.  A follow-on patch could include
>> >VariableLengthRowKey for folks that use strings in the rowkeys.
>> >
>> >
>> >  public void testCreate() throws Exception {
>> >
>> >    int elements[] = {RowKeySchema.SIZEOF_MD5_HASH,
>> >        RowKeySchema.SIZEOF_INT, RowKeySchema.SIZEOF_LONG};
>> >    RowKeySchema schema = new RowKeySchema(elements);
>> >
>> >    RowKey rowkey = schema.createRowKey();
>> >    rowkey.setHash(0, hashVal);
>> >    rowkey.setInt(1, intVal);
>> >    rowkey.setLong(2, longVal);
>> >
>> >    byte bytes[] = rowkey.getBytes();
>> >    Assert.assertEquals("key length", schema.getRowKeyLength(),
>> >        bytes.length);
>> >
>> >    Assert.assertEquals("e1", rowkey.getInt(1), intVal);