HBase, mail # user - Rowkey design question


Re: Rowkey design question
Paul van Hoven 2013-02-19, 16:24
Hey Tariq,

thanks for your quick answer. I'm not sure if I got the idea in the
second part of your answer. You mean if I use a timestamp as a rowkey I
should append a hash like this:

1357279200000+MD5HASH

and then the data would be distributed more equally?
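
The thread does not spell out where the hash should go. HBase sorts rows by their key bytes, so a key that still begins with the timestamp keeps consecutive writes next to each other. One common alternative is to put a small hash-derived bucket in front of the timestamp. The sketch below assumes that layout; the class name, the bucket count of 16, and the "bucket+timestamp+ip" format are all hypothetical and not taken from the thread.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

public class SaltedRowKey {
    // Hypothetical bucket count; a real value would depend on the cluster.
    static final int BUCKETS = 16;

    // Derive a stable bucket from the first byte of the MD5 digest of the IP.
    public static int bucketFor(String ip) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(ip.getBytes(StandardCharsets.UTF_8));
            return (digest[0] & 0xff) % BUCKETS;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Assumed layout: two-digit bucket, then the timestamp, then the IP.
    public static String rowKey(long timestamp, String ip) {
        return String.format(Locale.ROOT, "%02d+%d+%s", bucketFor(ip), timestamp, ip);
    }

    public static void main(String[] args) {
        System.out.println(rowKey(1357279200000L, "192.168.178.9"));
    }
}
```

The trade-off with a layout like this is that a time-range scan has to be issued once per bucket prefix, because rows for one time window are now spread over all buckets.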
2013/2/19 Mohammad Tariq <[EMAIL PROTECTED]>:
> Hello Paul,
>
>     Try this and see if it works :
>        scan.setStartRow(Bytes.toBytes(startDate.getTime() + ""));
>        scan.setStopRow(Bytes.toBytes(endDate.getTime() + 1 + ""));
>
> Also try not to use the timestamp (TS) as the rowkey, as it may lead to
> RegionServer (RS) hotspotting. Just add a hash to your rowkeys so that data
> is distributed evenly across all the RSs.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Tue, Feb 19, 2013 at 9:41 PM, Paul van Hoven <
> [EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I'm currently playing with hbase. The design of the rowkey seems to be
>> critical.
>>
>> The rowkey for a certain database table of mine is:
>>
>> timestamp+ipaddress
>>
>> It looks something like this when performing a scan on the table in the
>> shell:
>> hbase(main):012:0> scan 'ToyDataTable'
>> ROW                                         COLUMN+CELL
>>  1357020000000+192.168.178.9                column=CF:SampleCol,
>> timestamp=1361288601717, value=Entry_1 = 2013-01-01 07:00:00
>>
>> Since I have several rows for different timestamps, I'd like to restrict
>> a scan to just a region of the table, for example from 2013-01-07 to
>> 2013-01-09. Previously I only had a timestamp as the rowkey and I
>> could restrict the rowkey range like this:
>>
>> SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
>> Date startDate = formatter.parse("2013-01-07 07:00:00");
>> Date endDate = formatter.parse("2013-01-10 07:00:00");
>>
>> HTableInterface toyDataTable = pool.getTable("ToyDataTable");
>> Scan scan = new Scan(Bytes.toBytes(startDate.getTime()),
>>                      Bytes.toBytes(endDate.getTime()));
>>
>> But this no longer works with my new design.
>>
>> Is there a way to tell the scan object to filter the rows with respect
>> to the timestamp, or do I have to use a filter object?
>>
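
A likely reason the original scan stopped matching, sketched in plain Java without the HBase client: `Bytes.toBytes(long)` produces an 8-byte big-endian value, while the new rowkeys are text such as `1357020000000+192.168.178.9`. Row keys are compared as unsigned bytes, so binary bounds and text keys fall in different parts of the key space. The class and helper names below are hypothetical, `ByteBuffer.putLong` stands in for `Bytes.toBytes(long)`, and the timestamps are made-up values. String-encoded bounds only order correctly while all timestamps have the same number of digits.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RowKeyRangeDemo {
    // Unsigned lexicographic comparison, the ordering applied to row keys.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff, y = b[i] & 0xff;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    // Scan semantics: start row inclusive, stop row exclusive.
    public static boolean inRange(byte[] row, byte[] start, byte[] stop) {
        return compare(row, start) >= 0 && compare(row, stop) < 0;
    }

    // 8-byte big-endian encoding of a long (stand-in for Bytes.toBytes(long)).
    public static byte[] binary(long ts) {
        return ByteBuffer.allocate(8).putLong(ts).array();
    }

    // ASCII encoding of a text key or bound.
    public static byte[] text(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        long start = 1357538400000L, stop = 1357797600000L; // made-up bounds
        byte[] row = text("1357624800000+192.168.178.9");
        // Binary bounds start with 0x00 bytes and sort before every text key.
        System.out.println(inRange(row, binary(start), binary(stop)));
        // String-encoded bounds share the row key's encoding.
        System.out.println(inRange(row, text(Long.toString(start)),
                                        text(Long.toString(stop))));
    }
}
```

Under these assumptions the first check fails and the second succeeds. A row stamped exactly at the end timestamp sorts after a stop row that holds only that timestamp, because the row key is longer; using the end timestamp plus one as the stop row, as in the reply quoted above, brings that row back into the range.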