Paul van Hoven 2013-02-19, 16:11
Mohammad Tariq 2013-02-19, 16:16
Paul van Hoven 2013-02-19, 16:24
Re: Rowkey design question
Mohammad Tariq 2013-02-19, 17:34
No, before the timestamp. Row keys that sort next to each other go to the
same region. This is the default HBase behavior and is meant to make the
performance better. But sometimes one machine gets overloaded with reads
and writes because the load gets concentrated on that particular machine.
Timeseries data is the typical example, since every new key sorts right
after the previous one. So it's better to prefix the keys with a hash in
order to spread them across all the machines equally. HTH
BTW, did that range query work?
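To make the hash-before-timestamp idea concrete, here is a minimal sketch in plain Java. Everything in it is an assumption for illustration and not something from this thread: the class name `SaltedRowKey`, the bucket count of 16, the choice to hash the IP address, and the `NN-` prefix format. It only builds key strings and leaves the HBase calls out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: puts a small hash bucket in front of "timestamp+ip".
final class SaltedRowKey {
    // Assumed bucket count; in practice it would be tuned to the cluster.
    static final int BUCKETS = 16;

    // Derive a stable bucket in [0, BUCKETS) from the IP part of the key.
    static int bucketFor(String ip) {
        return Math.floorMod(ip.hashCode(), BUCKETS);
    }

    // Produces keys of the form "NN-<millis>+<ip>".
    static String build(long timestampMillis, String ip) {
        return String.format(Locale.ROOT, "%02d-%d+%s",
                bucketFor(ip), timestampMillis, ip);
    }

    // Start (inclusive) and stop (exclusive) key for one bucket and time window.
    static String[] scanRange(int bucket, long startMillis, long endMillisInclusive) {
        return new String[] {
            String.format(Locale.ROOT, "%02d-%d", bucket, startMillis),
            String.format(Locale.ROOT, "%02d-%d", bucket, endMillisInclusive + 1)
        };
    }

    // Row keys are byte arrays in HBase, so encode the string before use.
    static byte[] toBytes(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }
}
```

The trade-off of this design is that a time-range query no longer maps to one contiguous key range: with a prefix like this, one scan per bucket is needed (16 here), and the results have to be merged on the client.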
On Tue, Feb 19, 2013 at 9:54 PM, Paul van Hoven <
[EMAIL PROTECTED]> wrote:
> Hey Tariq,
> thanks for your quick answer. I'm not sure if I got the idea in the
> second part of your answer. You mean if I use a timestamp as a rowkey I
> should append a hash like this:
> and then the data would be distributed more equally?
> 2013/2/19 Mohammad Tariq <[EMAIL PROTECTED]>:
> > Hello Paul,
> > Try this and see if it works :
> > scan.setStartRow(Bytes.toBytes(startDate.getTime() + ""));
> > scan.setStopRow(Bytes.toBytes(endDate.getTime() + 1 + ""));
> > Also try not to use TS as the rowkey, as it may lead to RS hotspotting.
> > Just add a hash to your rowkeys so that data is distributed evenly on all
> > the RSs.
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> > On Tue, Feb 19, 2013 at 9:41 PM, Paul van Hoven <
> > [EMAIL PROTECTED]> wrote:
> >> Hi,
> >> I'm currently playing with hbase. The design of the rowkey seems to be
> >> critical.
> >> The rowkey for a certain database table of mine is:
> >> timestamp+ipaddress
> >> It looks something like this when performing a scan on the table in the
> >> shell:
> >> hbase(main):012:0> scan 'ToyDataTable'
> >> ROW COLUMN+CELL
> >> 1357020000000+192.168.178.9 column=CF:SampleCol, timestamp=1361288601717, value=Entry_1 = 2013-01-01 07:00:00
> >> Since I have several rows for different timestamps, I'd like to restrict a
> >> scan to just a range of the table, for example from 2013-01-07 to
> >> 2013-01-09. Previously I only had a timestamp as the rowkey and I
> >> could restrict the rowkey like that:
> >> SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
> >> Date startDate = formatter.parse("2013-01-07 07:00:00");
> >> Date endDate = formatter.parse("2013-01-10 07:00:00");
> >> HTableInterface toyDataTable = pool.getTable("ToyDataTable");
> >> Scan scan = new Scan(Bytes.toBytes(startDate.getTime()),
> >>     Bytes.toBytes(endDate.getTime()));
> >> But this no longer works with my new design.
> >> Is there a way to tell the scan object to filter the rows with respect
> >> to the timestamp, or do I have to use a filter object?
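One likely reason the old bounds stop matching, assuming the new keys are stored as text the way the shell output suggests: `Bytes.toBytes(long)` produces an 8-byte big-endian value, while a key such as `1357020000000+192.168.178.9` starts with ASCII digits. The sketch below reproduces that comparison in plain Java without an HBase dependency. The class `RowKeyRange` and its method names are hypothetical, and `Arrays.compareUnsigned` (Java 9+) stands in for HBase's unsigned lexicographic row ordering.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper comparing string-encoded and binary scan bounds.
final class RowKeyRange {
    // Bound encoded as decimal text, matching keys like "1357020000000+ip".
    static byte[] stringBound(long millis) {
        return Long.toString(millis).getBytes(StandardCharsets.UTF_8);
    }

    // Bound encoded as 8 big-endian bytes, the layout Bytes.toBytes(long) uses.
    static byte[] binaryBound(long millis) {
        return ByteBuffer.allocate(Long.BYTES).putLong(millis).array();
    }

    // True if start <= row < stop under unsigned lexicographic byte order.
    static boolean inRange(byte[] row, byte[] start, byte[] stop) {
        return Arrays.compareUnsigned(start, row) <= 0
                && Arrays.compareUnsigned(row, stop) < 0;
    }
}
```

With text bounds, a row for timestamp `t` falls between `stringBound(t)` and `stringBound(t + 1)`. With binary bounds the first byte of both bounds is `0x00` for present-day timestamps, so every text key sorts after the stop row and the scan comes back empty. The text comparison only works because all the timestamps have the same number of digits.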
Paul van Hoven 2013-02-19, 17:50
Mohammad Tariq 2013-02-19, 17:54
Asaf Mesika 2013-02-21, 22:15
Mohammad Tariq 2013-02-21, 22:25