Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase, mail # dev - InternalScanner next(..) methods


Copy link to this message
-
InternalScanner next(..) methods
Matt Corgan 2012-12-09, 07:27
I'm looking at the KeyValueHeap trying to see how we can make it work with
Cells.  I'm curious, in this method

  @Override
  public boolean next(List<KeyValue> result, int limit, String metric)
throws IOException {
    if (this.current == null) {
      return false;
    }
    InternalScanner currentAsInternal = (InternalScanner)this.current;
    boolean mayContainMoreRows = currentAsInternal.next(result, limit,
metric);

how is it getting multiple results from a single scanner without putting
the scanner back on the heap?  Couldn't that skip KeyValues?  Is it that
it's only used at the Region level where the family-per-file semantics
guarantee that all KeyValues in a single family will sort together?

My bigger question is regarding the next(List<KeyValue> result, int limit)
methods from the InternalScanner interface.  What's the reasoning for
getting multiple results in one call as opposed to calling the next()
method a bunch of times?  Buffering the KeyValues in a List like that means
the Cells would have to be expanded into full KeyValues which would be nice
to avoid.  Is there some logic that depends on getting a whole row of
values, even though you may only get a partial row due to the limit param?

Similarly, I see there is Filter.filterRow(List<KeyValue>) which looks like
it's barely used.  Is that an important method?  Doesn't look like it's
used much, but maybe people have custom Filters that need it.

Thanks,
Matt