-
Composite key, scan on partial key
Bryan Keller 2010-12-14, 23:28
I had a question about using a Scan on part of a composite key. Say I have order line item rows, and the ID is order ID + line item ID. Each ID is a random string. I want to get all line items for an order with my Scan object.
Setting the startRow on Scan is easy enough, just set it to the order ID and leave off the line item ID. However, because endRow is exclusive, I need to come up with a key that is just past the order ID. This would be straightforward if the keys are numeric (just add one to the order ID), but becomes kind of a kludge when the keys are strings.
Right now I build the keys with a byte separator between the two strings and set it to 0 when storing. Then when I want to scan, I create the startRow with the Order ID + (byte)0, and the endRow with Order ID + (byte)1. Seems like kind of a waste to have that extra byte just for this purpose, though. Is there a better approach, like specifying the endRow inclusively?
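A minimal sketch of the separator scheme described above, in plain Java with no HBase dependency. The class and method names (`CompositeKeys`, `rowKey`, `startRow`, `stopRow`, `inRange`) are hypothetical and chosen for illustration, and the sketch assumes order IDs never contain a 0x00 byte. The `compare` method mimics the unsigned byte-wise ordering that HBase uses for row keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: builds composite row keys of the
// form orderId + 0x00 + lineItemId, plus the matching scan bounds.
final class CompositeKeys {
    // Assumption: order IDs never contain this byte.
    private static final byte SEPARATOR = 0;

    // Row key as stored: order ID, separator byte, line item ID.
    static byte[] rowKey(String orderId, String lineItemId) {
        byte[] order = orderId.getBytes(StandardCharsets.UTF_8);
        byte[] item = lineItemId.getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[order.length + 1 + item.length];
        System.arraycopy(order, 0, key, 0, order.length);
        key[order.length] = SEPARATOR;
        System.arraycopy(item, 0, key, order.length + 1, item.length);
        return key;
    }

    // Inclusive start row: order ID followed by the separator (byte 0).
    static byte[] startRow(String orderId) {
        return withSuffix(orderId, SEPARATOR);
    }

    // Exclusive stop row: order ID followed by separator + 1 (byte 1).
    static byte[] stopRow(String orderId) {
        return withSuffix(orderId, (byte) (SEPARATOR + 1));
    }

    private static byte[] withSuffix(String orderId, byte suffix) {
        byte[] order = orderId.getBytes(StandardCharsets.UTF_8);
        byte[] key = Arrays.copyOf(order, order.length + 1);
        key[order.length] = suffix;
        return key;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True if the row falls in [startRow, stopRow) for the given order.
    static boolean inRange(byte[] row, String orderId) {
        return compare(row, startRow(orderId)) >= 0
            && compare(row, stopRow(orderId)) < 0;
    }
}
```

With these bounds, a row for order "abc" falls inside the range, while a row for order "abcd" does not, because the byte after "abc" is 'd' rather than the separator. In an HBase client the two arrays would be passed as the scan's start and stop rows.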
-
Re: Composite key, scan on partial key
Ryan Rawson 2010-12-14, 23:32
Hey,
If the order IDs are variable length, then you will have to use a separator. You can then use a start row of 'foo:' and a prefix filter of 'foo:'.
The start/end key approach won't work with variable-length IDs in this way. But the good news is that the prefix filter is very efficient.
Good luck! -ryan
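The idea above can be sketched with an in-memory sorted set standing in for the table. This is a simulation, not the HBase client API: `PrefixScanSketch`, `scanPrefix`, and `sampleKeys` are hypothetical names, and the keys are plain ASCII strings so that Java's string ordering matches byte ordering. In a real client, the seek would correspond to the scan's start row and the `startsWith` check to a prefix filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// In-memory stand-in for a table sorted by row key; not the HBase API.
final class PrefixScanSketch {

    // Seek to the first key >= prefix, then read until a key no longer
    // starts with the prefix.
    static List<String> scanPrefix(NavigableSet<String> sortedKeys, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String key : sortedKeys.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break; // keys sharing a prefix are contiguous in sorted order
            }
            matches.add(key);
        }
        return matches;
    }

    // Sample rows of the form orderId + ':' + lineItemId.
    static NavigableSet<String> sampleKeys() {
        NavigableSet<String> keys = new TreeSet<>();
        keys.add("bar:1");
        keys.add("foo:1");
        keys.add("foo:2");
        keys.add("foobar:1");
        keys.add("fop:1");
        return keys;
    }
}
```

Scanning with the prefix "foo:" returns only the two line items of order "foo". Scanning with "foo" (no separator) would also return "foobar:1", which is why variable-length IDs need the separator.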
-
Re: Composite key, scan on partial key
Bryan Keller 2010-12-15, 01:32
Isn't a filter much less efficient than specifying a range with the Scan object?
-
Re: Composite key, scan on partial key
Ryan Rawson 2010-12-15, 01:35
It isn't much less efficient: you only select the data you need. The extra filter call should be operating on well-cached data and adds just a few extra comparisons. I don't have concrete benchmarks; I'm speaking from my knowledge of the codebase. Java is pretty good at dynamic inlining, and the JIT can work wonders.
-ryan
-
RE: Composite key, scan on partial key
Jonathan Gray 2010-12-15, 04:53
There might be a little confusion.
Specifying start/stop rows vs. scanning all rows with a filter... yes, clearly the start/stop is far more efficient.
What Ryan is talking about is specifying the start row and then using a filter to determine when you're done with the rows you want. In this case, the difference would probably be negligible.
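To make the distinction concrete, here is a rough simulation that counts how many rows each strategy has to look at. It uses a sorted in-memory set rather than HBase, so the numbers only illustrate the shape of the difference, not real performance. All names (`ScanCostSketch`, `Count`, `buildTable`, and the three strategy methods) and the key layout `orderNNNN:item` are assumptions made for this sketch.

```java
import java.util.Locale;
import java.util.NavigableSet;
import java.util.TreeSet;

// Counts rows examined by three scan strategies over a sorted key set.
final class ScanCostSketch {

    static final class Count {
        final int examined;
        final int matched;

        Count(int examined, int matched) {
            this.examined = examined;
            this.matched = matched;
        }
    }

    // Hypothetical table: keys like "order0042:1".
    static NavigableSet<String> buildTable(int orders, int itemsPerOrder) {
        NavigableSet<String> keys = new TreeSet<>();
        for (int o = 0; o < orders; o++) {
            for (int i = 0; i < itemsPerOrder; i++) {
                keys.add(String.format(Locale.ROOT, "order%04d:%d", o, i));
            }
        }
        return keys;
    }

    // No start row: every row in the table is tested against the prefix.
    static Count filterOnly(NavigableSet<String> keys, String prefix) {
        int examined = 0;
        int matched = 0;
        for (String key : keys) {
            examined++;
            if (key.startsWith(prefix)) {
                matched++;
            }
        }
        return new Count(examined, matched);
    }

    // Start row plus prefix check: stop at the first row that fails.
    static Count startRowPlusFilter(NavigableSet<String> keys, String prefix) {
        int examined = 0;
        int matched = 0;
        for (String key : keys.tailSet(prefix, true)) {
            examined++;
            if (!key.startsWith(prefix)) {
                break;
            }
            matched++;
        }
        return new Count(examined, matched);
    }

    // Start and stop row. Assumes a non-empty prefix whose last character
    // can be incremented to form the exclusive stop key.
    static Count startAndStopRow(NavigableSet<String> keys, String prefix) {
        char last = prefix.charAt(prefix.length() - 1);
        String stop = prefix.substring(0, prefix.length() - 1) + (char) (last + 1);
        int rows = keys.subSet(prefix, true, stop, false).size();
        return new Count(rows, rows);
    }
}
```

For a table of 1000 orders with 3 line items each and the prefix "order0500:", the filter-only strategy examines all 3000 rows, the start row plus filter strategy examines 4 (the 3 matches and the first row past them), and the start/stop range examines 3. The last two differ by a single row, which is the sense in which the difference is negligible.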