Accumulo >> mail # user >> BatchScanner sort question


Re: BatchScanner sort question
The BatchScanner works by fetching batches from all tablets in the scan.
Consecutive batches will therefore typically not be in sorted order
relative to one another. Because batches are cut purely on individual
key-value pairs, a batch can end mid-row, and the next key you receive may
belong to a completely different row, possibly also mid-row. If you need
each row returned in its entirety, use the WholeRowIterator.

tl;dr: Option 2 is accurate, but you can force Option 1 with the WholeRowIterator.
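
To make the difference concrete, here is a small toy simulation in plain Java. It does not use any Accumulo classes; the class name, the method names, the batch size of 1 and the round-robin arrival order are all assumptions made for illustration, and the real BatchScanner makes no promise about arrival order at all. It only shows why cutting batches on key-value pairs can split a row, and why packing each row into a single item first keeps it together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: no Accumulo classes are used, and every name here is hypothetical.
class BatchOrderSketch {

    // Collapse a tablet's sorted entries into one packed item per row,
    // loosely imitating what a whole-row iterator does before batching.
    static List<String> packRows(List<String> sortedEntries) {
        Map<String, StringBuilder> rows = new LinkedHashMap<>();
        for (String entry : sortedEntries) {
            String row = entry.substring(0, entry.indexOf(':'));
            StringBuilder packed = rows.get(row);
            if (packed == null) {
                rows.put(row, new StringBuilder(entry));
            } else {
                packed.append(" | ").append(entry);
            }
        }
        List<String> out = new ArrayList<>();
        for (StringBuilder packed : rows.values()) {
            out.add(packed.toString());
        }
        return out;
    }

    // Cut each tablet's items into batches of batchSize (ignoring row boundaries),
    // then take one batch per tablet in round-robin order. Round-robin is an
    // assumption standing in for "whatever order the batches happen to arrive in".
    static List<String> scan(List<List<String>> tablets, int batchSize, boolean wholeRows) {
        List<List<String>> items = new ArrayList<>();
        int longest = 0;
        for (List<String> tablet : tablets) {
            List<String> t = wholeRows ? packRows(tablet) : tablet;
            items.add(t);
            longest = Math.max(longest, t.size());
        }
        List<String> out = new ArrayList<>();
        for (int start = 0; start < longest; start += batchSize) {
            for (List<String> t : items) {
                int end = Math.min(start + batchSize, t.size());
                if (start < end) {
                    out.addAll(t.subList(start, end));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> tablets = Arrays.asList(
            Arrays.asList("A:CF1:CQ1:1", "A:CF2:CQ2:2"),   // tablet holding row A
            Arrays.asList("B:CF1:CQ1:1", "C:CF1:CQ1:1"));  // tablet holding rows B and C
        System.out.println(scan(tablets, 1, false)); // batches cut per key-value pair
        System.out.println(scan(tablets, 1, true));  // batches cut per packed row
    }
}
```

In the first call the two entries of row A end up separated by an entry from row B, which is the Option 2 behaviour. In the second call row A arrives as a single packed item, so it cannot be split; the order of the rows themselves is still not something to rely on.
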
On Fri, Oct 25, 2013 at 12:59 PM, Peter Rainer <[EMAIL PROTECTED]> wrote:

> Hi,
>
> in the BatchScanner JavaDoc it says "Also only use this *when you do not
> care about the returned data being in sorted order*. If you want to
> lookup a few ranges and expect those ranges to contain a lot of data, then
> use the Scanner instead. Also, the Scanner will return data in sorted
> order, this will not."
>
> I'm not 100% sure how to interpret this, so I was wondering if any of
> you could help me clarify it:
>
> *Option 1)*
> Rows are not sorted, but all Key/Value Pairs with the same Row Key are in
> sequence
>
> Example:
> Format: Key:CF:CQ:Value
> A:CF1:CQ1:1
> A:CF2:CQ2:2
> C:CF1:CQ1:1
> B:CF1:CQ1:1
>
> *Option 2)*
> Rows are not sorted and not even Key/Value Pairs with the same Row Key are
> in sequence
>
> Example:
> Format: Key:CF:CQ:Value
> A:CF1:CQ1:1
> C:CF1:CQ1:1
> A:CF2:CQ2:2
> B:CF1:CQ1:1
>
>
> Thanks,
> Peter
>
>