HBase >> mail # dev >> InternalScanner next(..) methods


Stack 2012-12-10, 20:26
Re: InternalScanner next(..) methods
On Sat, Dec 8, 2012 at 11:27 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:

> I'm looking at the KeyValueHeap trying to see how we can make it work with
> Cells.  I'm curious, in this method
>
>   @Override
>   public boolean next(List<KeyValue> result, int limit, String metric)
>       throws IOException {
>     if (this.current == null) {
>       return false;
>     }
>     InternalScanner currentAsInternal = (InternalScanner)this.current;
>     boolean mayContainMoreRows = currentAsInternal.next(result, limit, metric);
>
> how is it getting multiple results from a single scanner without putting
> the scanner back on the heap?  Couldn't that skip KeyValues?  Is it that
> it's only used at the Region level where the family-per-file semantics
> guarantee that all KeyValues in a single family will sort together?
>
>

Sort of like Lars, if it strikes the likes of you as voodoo, then it needs
fixing.
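
To make the concern in the quoted question concrete, here is a standalone sketch of a heap-based merge. It is not HBase code: the class HeapMergeSketch, the merge method, and the batch parameter are all hypothetical, and plain integers stand in for KeyValues. It only shows that draining several items from the top source before re-inserting it is safe when that source's items sort together, and can emit items out of order when the sources interleave.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from HBase.
public class HeapMergeSketch {

    // Merges sorted sources through a heap keyed on each source's head element.
    // `batch` is how many elements are drained from the top source before it is
    // put back on the heap; batch == 1 is the textbook k-way merge.
    static List<Integer> merge(List<List<Integer>> sources, int batch) {
        PriorityQueue<Deque<Integer>> heap = new PriorityQueue<>(
            (a, b) -> Integer.compare(a.peekFirst(), b.peekFirst()));
        for (List<Integer> s : sources) {
            if (!s.isEmpty()) {
                heap.add(new ArrayDeque<>(s));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Deque<Integer> top = heap.poll();
            // Drain up to `batch` elements without consulting the heap again.
            for (int i = 0; i < batch && !top.isEmpty(); i++) {
                out.add(top.pollFirst());
            }
            if (!top.isEmpty()) {
                heap.add(top); // re-insert so the new head is compared again
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> interleaved =
            Arrays.asList(Arrays.asList(1, 4, 7), Arrays.asList(2, 5, 8));
        List<List<Integer>> disjoint =
            Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3, 4));
        System.out.println(merge(interleaved, 1)); // [1, 2, 4, 5, 7, 8]
        System.out.println(merge(interleaved, 2)); // [1, 4, 2, 5, 7, 8]
        System.out.println(merge(disjoint, 2));    // [1, 2, 3, 4]
    }
}
```

Whether the real KeyValueHeap call sites always fall into the "sort together" case is exactly the open question in this thread; the sketch does not answer it.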

> My bigger question is regarding the next(List<KeyValue> result, int limit)
> methods from the InternalScanner interface.  What's the reasoning for
> getting multiple results in one call as opposed to calling the next()
> method a bunch of times?  Buffering the KeyValues in a List like that means
> the Cells would have to be expanded into full KeyValues which would be nice
> to avoid.  Is there some logic that depends on getting a whole row of
> values, even though you may only get a partial row due to the limit param?
>
>
+1 on no buffering as we go up through the server layers
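
A rough sketch of the two calling styles under discussion, for comparison. The names here (ScannerStyleSketch, StepScanner, stepOver, nextBatch, totalLength) are hypothetical and are not the InternalScanner API; strings stand in for Cells. The point is only that a one-at-a-time interface lets a consumer process values without materializing a List, while a batch call can still be layered on top where a caller wants one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names come from HBase.
public class ScannerStyleSketch {

    // One-at-a-time style: advance() moves to the next value and
    // current() exposes it without copying it into a buffer.
    interface StepScanner<T> {
        boolean advance();
        T current();
    }

    // Wraps a list as a StepScanner so the sketch is runnable.
    static <T> StepScanner<T> stepOver(List<T> values) {
        Iterator<T> it = values.iterator();
        return new StepScanner<T>() {
            private T cur;

            public boolean advance() {
                if (!it.hasNext()) {
                    cur = null;
                    return false;
                }
                cur = it.next();
                return true;
            }

            public T current() {
                return cur;
            }
        };
    }

    // Batch style layered on top: fills `result` with up to `limit` values.
    // Returns true if the limit was reached (more values may remain),
    // false if the scanner ran out first.
    static <T> boolean nextBatch(StepScanner<T> scanner, List<T> result, int limit) {
        for (int added = 0; added < limit; added++) {
            if (!scanner.advance()) {
                return false;
            }
            result.add(scanner.current());
        }
        return true;
    }

    // A consumer that needs no List at all: it sums lengths as values stream by.
    static int totalLength(StepScanner<String> scanner) {
        int total = 0;
        while (scanner.advance()) {
            total += scanner.current().length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalLength(stepOver(Arrays.asList("a", "bb", "ccc")))); // 6

        StepScanner<String> s = stepOver(Arrays.asList("a", "bb", "ccc"));
        List<String> batch = new ArrayList<>();
        System.out.println(nextBatch(s, batch, 2) + " " + batch); // true [a, bb]
    }
}
```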

> Similarly, I see there is Filter.filterRow(List<KeyValue>) which looks like
> it's barely used.  Is that an important method?  Doesn't look like it's
> used much, but maybe people have custom Filters that need it.
>

+1 on removing this method if it messes us up; "custom" filters that "may" be
using it are not a reason to keep it in 0.96 -- the singularity.

St.Ack