


BatchScanner sort question
Hi,
in the BatchScanner JavaDoc it says "Also only use this *when you do not care about the returned data being in sorted order*. If you want to lookup a few ranges and expect those ranges to contain a lot of data, then use the Scanner instead. Also, the Scanner will return data in sorted order, this will not."
I'm not 100% sure how to interpret this, so I was wondering if any of you could help me clarify it:
*Option 1)* Rows are not sorted, but all Key/Value Pairs with the same Row Key are in sequence

Example:
Format: Key:CF:CQ:Value
A:CF1:CQ1:1
A:CF2:CQ2:2
C:CF1:CQ1:1
B:CF1:CQ1:1

*Option 2)* Rows are not sorted and not even Key/Value Pairs with the same Row Key are in sequence

Example:
Format: Key:CF:CQ:Value
A:CF1:CQ1:1
C:CF1:CQ1:1
A:CF2:CQ2:2
B:CF1:CQ1:1

Thanks,
Peter

Re: BatchScanner sort question
The batch scanner works by getting batches from all tablets in the scan. Each batch is sorted internally, but the batches typically arrive in no particular order relative to one another. Because batches are built purely from individual key/value pairs, a batch can end mid-row, and the next key you receive can belong to a completely different row, possibly also mid-row. If you want to guarantee entire rows, the WholeRowIterator can be used.
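To make the batching behaviour concrete, here is a minimal plain-Java simulation. It does not use the Accumulo API at all: the class name, the two-tablet layout, the batch size of 1, and the round-robin arrival order are all assumptions chosen for illustration (real arrival order is not deterministic). It only shows how a row can be split when batches from different tablets interleave, and how the entries could be regrouped per row afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation, not Accumulo code.
class BatchInterleaveSketch {

    // Split one tablet's sorted entries into batches of at most batchSize entries.
    // A batch boundary ignores row boundaries, so it can fall mid-row.
    static List<List<String>> toBatches(List<String> sortedEntries, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < sortedEntries.size(); i += batchSize) {
            int end = Math.min(i + batchSize, sortedEntries.size());
            batches.add(new ArrayList<>(sortedEntries.subList(i, end)));
        }
        return batches;
    }

    // Assumed arrival order: one batch per tablet, round-robin.
    // This is just one of many orders a client might observe.
    static List<String> interleave(List<List<List<String>>> batchesPerTablet) {
        List<String> arrival = new ArrayList<>();
        boolean more = true;
        for (int round = 0; more; round++) {
            more = false;
            for (List<List<String>> tabletBatches : batchesPerTablet) {
                if (round < tabletBatches.size()) {
                    arrival.addAll(tabletBatches.get(round));
                    more = true;
                }
            }
        }
        return arrival;
    }

    // Regroup entries by row (the text before the first ':'),
    // keeping rows in first-seen order. Buffers everything in memory.
    static Map<String, List<String>> groupByRow(List<String> entries) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        for (String entry : entries) {
            String row = entry.substring(0, entry.indexOf(':'));
            rows.computeIfAbsent(row, k -> new ArrayList<>()).add(entry);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed layout: tablet 1 holds row A, tablet 2 holds rows B and C.
        List<String> tablet1 = Arrays.asList("A:CF1:CQ1:1", "A:CF2:CQ2:2");
        List<String> tablet2 = Arrays.asList("B:CF1:CQ1:1", "C:CF1:CQ1:1");

        // Batch size 1 is artificially small so that row A is split.
        List<String> arrival = interleave(Arrays.asList(
                toBatches(tablet1, 1), toBatches(tablet2, 1)));

        System.out.println(arrival);
        // [A:CF1:CQ1:1, B:CF1:CQ1:1, A:CF2:CQ2:2, C:CF1:CQ1:1]

        System.out.println(groupByRow(arrival));
        // {A=[A:CF1:CQ1:1, A:CF2:CQ2:2], B=[B:CF1:CQ1:1], C=[C:CF1:CQ1:1]}
    }
}
```

The first printed list has the shape of Option 2: row A is interrupted by an entry from another tablet. The regrouping step is only a client-side stand-in; the WholeRowIterator mentioned above works differently, packing each row into a single key/value pair on the server so that a batch boundary cannot fall inside a row.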
tl;dr: Option 2 is accurate, but you can force Option 1 to occur.

On Fri, Oct 25, 2013 at 12:59 PM, Peter Rainer <[EMAIL PROTECTED]> wrote:
> [...]

Re: BatchScanner sort question
Thanks John, that helps me a lot.

On Fri, Oct 25, 2013 at 7:03 PM, John Vines <[EMAIL PROTECTED]> wrote:
> [...]

