Re: Poor HBase map-reduce scan performance
It seems to be in the ballpark of what I was getting at, but I haven't
fully digested the code yet, so I can't say for sure.

Here's what I'm getting at.  Looking at
o.a.h.h.client.ClientScanner.next() in the 0.94.2 source I have loaded, I
see there are three branches with respect to the cache:

public Result next() throws IOException {
  // If the scanner is closed and there's nothing left in the cache,
  // next is a no-op.
  if (cache.size() == 0 && this.closed) {
    return null;
  }

  if (cache.size() == 0) {
    // Request more results from RS
    ...
  }

  if (cache.size() > 0) {
    return cache.poll();
  }

  ...
  return null;
}
I think that middle branch wants to change as follows (pseudo-code):

if the cache size is below a certain threshold then
  initiate an asynchronous action to refill it
  if the cache is empty then
    block until the refill delivers a result
  done
done

Or something along those lines.  I haven't grokked the patch well enough
yet to tell if that's what it does.  What I think is happening in the
0.94.2 code I've got is that it requests nothing until the cache is empty,
then blocks until it's non-empty.  We want to eagerly and asynchronously
refill the cache so that we ideally never have to block.
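Something like this, as a very rough sketch (the wrapper class and all its
names are hypothetical; this isn't the HBASE-8420 patch, just the shape of
the idea):

import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;

// Hypothetical sketch: wrap a ResultScanner and refill a local cache on a
// background thread so that next() only blocks when the cache is empty.
public class PrefetchingScanner {
  private final ResultScanner scanner;  // underlying HBase scanner
  private final int batchSize;          // rows fetched per refill RPC
  private final int threshold;          // refill when cache drops below this
  private final Queue<Result> cache = new ConcurrentLinkedQueue<Result>();
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  private Future<?> pendingRefill;
  private volatile boolean exhausted;

  public PrefetchingScanner(ResultScanner scanner, int batchSize) {
    this.scanner = scanner;
    this.batchSize = batchSize;
    this.threshold = batchSize / 2;
    refillAsync();  // prime the cache right away
  }

  public Result next() throws IOException {
    // Eagerly kick off a refill *before* we run dry.
    if (cache.size() < threshold) {
      refillAsync();
    }
    Result r = cache.poll();
    if (r == null && !exhausted) {
      waitForRefill();  // only block when the cache is truly empty
      r = cache.poll();
    }
    return r;
  }

  private synchronized void refillAsync() {
    if (exhausted || (pendingRefill != null && !pendingRefill.isDone())) {
      return;  // a refill is already in flight
    }
    pendingRefill = executor.submit(new Runnable() {
      public void run() {
        try {
          Result[] batch = scanner.next(batchSize);  // one RPC to the RS
          if (batch.length == 0) {
            exhausted = true;
          }
          for (Result res : batch) {
            cache.add(res);
          }
        } catch (IOException e) {
          exhausted = true;  // a real patch would surface this in next()
        }
      }
    });
  }

  private void waitForRefill() throws IOException {
    try {
      if (pendingRefill != null) {
        pendingRefill.get();
      }
    } catch (Exception e) {
      throw new IOException(e);
    }
  }
}
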
Sandy
On 5/22/13 1:39 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:

>Sandy:
>Do you think the following JIRA would help with what you expect in this
>regard?
>
>HBASE-8420 Port HBASE-6874 Implement prefetching for scanners from 0.89-fb
>
>Cheers
>
>On Wed, May 22, 2013 at 1:29 PM, Sandy Pratt <[EMAIL PROTECTED]> wrote:
>
>> I found this thread on search-hadoop.com just now because I've been
>> wrestling with the same issue for a while and have as yet been unable to
>> solve it.  However, I think I have an idea of the problem.  My theory is
>> based on assumptions about what's going on in HBase and HDFS internally,
>> so please correct me if I'm wrong.
>>
>> Briefly, I think the issue is that sequential reads from HDFS are
>> pipelined, whereas sequential reads from HBase are not.  Therefore,
>> sequential reads from HDFS tend to keep the IO subsystem saturated,
>> while sequential reads from HBase allow it to idle for a relatively
>> large proportion of time.
>>
>> To make this more concrete, suppose that I'm reading N bytes of data
>> from a file in HDFS.  I issue the calls to open the file and begin to
>> read (from an InputStream, for example).  As I'm reading byte 1 of the
>> stream at my client, the datanode is reading byte M, where 1 < M <= N,
>> from disk.  Thus, three activities tend to happen concurrently for the
>> most part (disregarding the beginning and end of the file): 1)
>> processing at the client; 2) streaming over the network from datanode
>> to client; and 3) reading data from disk at the datanode.  The
>> proportion of time these three activities overlap tends towards 100%
>> as N -> infinity.
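>>
>> (A minimal illustration of that pipelined read pattern; the path and
>> the process() call are made up:)
>>
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FSDataInputStream;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.fs.Path;
>>
>> Configuration conf = new Configuration();
>> FileSystem fs = FileSystem.get(conf);
>> FSDataInputStream in = fs.open(new Path("/data/example.seq"));
>> byte[] buf = new byte[64 * 1024];
>> int n;
>> // While we process one buffer here, the datanode is already reading
>> // the bytes we'll ask for next and the network transfer is in
>> // flight, so all three activities overlap.
>> while ((n = in.read(buf)) > 0) {
>>   process(buf, n);  // hypothetical client-side processing
>> }
>> in.close();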
>>
>> Now suppose I read a batch of R records from HBase (where R = whatever
>> scanner caching happens to be).  As I understand it, I issue my call
>> to ResultScanner.next(), and this causes the RegionServer to block as
>> if on a page fault while it loads enough HFile blocks from disk to
>> cover the R records I (implicitly) requested.  After the blocks are
>> loaded into the block cache on the RS, the RS returns R records to me
>> over the network.  Then I process the R records locally.  When they
>> are exhausted, this cycle repeats.  The notable upshot is that while
>> the RS is faulting HFile blocks into the cache, my client is blocked.
>> Furthermore, while my client is processing records, the RS is idle
>> with respect to work on behalf of my client.
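>>
>> (For reference, here's the scan setup I'm describing; the caching
>> value is a toy number, and htable is assumed to be an open HTable:)
>>
>> import org.apache.hadoop.hbase.client.Result;
>> import org.apache.hadoop.hbase.client.ResultScanner;
>> import org.apache.hadoop.hbase.client.Scan;
>>
>> Scan scan = new Scan();
>> scan.setCaching(1000);       // R: rows returned per RPC to the RS
>> scan.setCacheBlocks(false);  // typical for full-table MR scans
>> ResultScanner results = htable.getScanner(scan);
>> for (Result r : results) {
>>   // While we process r here, the RS does no work for this client;
>>   // each time the client-side cache empties, next() blocks for a
>>   // full round trip (plus any disk reads on the RS).
>> }
>> results.close();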
>>
>> That last point is really the killer, if I'm correct in my
>> assumptions.  It means that Scanner caching and larger block sizes
>> work only to amortize the fixed overhead of disk IOs and RPCs -- they
>> do nothing to keep the IO subsystems saturated during sequential
>> reads.  What *should* happen is that the RS should treat the Scanner
>> caching value (R above) as a hint