Re: independent scans to same region processed serially
lars hofhansl 2013-02-09, 19:04
If you had something, that'd be great. Preferably with a local/single region server.
(Maybe time to take this private :) )

-- Lars

----- Original Message -----
From: James Taylor <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>
Cc:
Sent: Saturday, February 9, 2013 9:28 AM
Subject: Re: independent scans to same region processed serially

Ok, thanks. Are you able to repro easily, or would you like me to put something together?

James

On Feb 9, 2013, at 9:02 AM, "lars hofhansl" <[EMAIL PROTECTED]> wrote:

> I looked through the code. Nothing obvious jumps out.
> We can sit together on Monday and run it through a profiler.
>
> -- Lars
>
> ----- Original Message -----
> From: James Taylor <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>
> Cc:
> Sent: Friday, February 8, 2013 9:52 PM
> Subject: Re: independent scans to same region processed serially
>
> All data is in the blockcache and there are plenty of handlers. To repro,
> you could:
> - create a table pre-split into, for example, three regions
> - execute serially a scan on the middle region
> - execute two parallel scans each on half of the middle region
> - you'd expect the parallel scan to execute near twice as fast, but
> we're seeing it execute slower than the serial scan.
> We're using the same HConnection with different HTable instances for
> each scan.
>
> James
>
> On 02/08/2013 06:51 PM, lars hofhansl wrote:
>> Is your data all in the blockcache? Otherwise you might have run into HBASE-7336 (https://issues.apache.org/jira/browse/HBASE-7336), fixed in 0.94.4.
>> I assume you have enough handlers, etc. (i.e. does the same happen if you issue multiple scan requests across different regions of the same region server?)
>>
>> -- Lars
>>
>> ________________________________
>> From: James Taylor <[EMAIL PROTECTED]>
>> To: HBase User <[EMAIL PROTECTED]>
>> Sent: Friday, February 8, 2013 5:49 PM
>> Subject: independent scans to same region processed serially
>>
>> Wanted to check with folks and see if they've seen an issue around this before digging in deeper. I'm on 0.94.2. If I execute in parallel multiple scans to different parts of the same region, they appear to be processed serially. It's actually faster from the client side to execute a single serial scan than it is to execute multiple parallel scans to different segments of the region. I do have region observer coprocessors for the table I'm scanning, but my code is not doing any synchronization.
>>
>> Is there a known limitation in this area? Anyone else see anything similar?
>>
>> James
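
The repro described in the thread (one serial scan over the middle region versus two parallel scans, each over half of it) could be sketched as a small timing harness. This is only an illustration and does not talk to HBase: `scan_range` is a hypothetical callable standing in for a client-side scan over a `[start, stop)` key range, integer keys are assumed for simplicity, and `fake_scan` is a made-up stand-in that just sleeps in proportion to the range size.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop):
    """Split the integer key range [start, stop) into two contiguous halves."""
    mid = start + (stop - start) // 2
    return (start, mid), (mid, stop)


def timed(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    t0 = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - t0


def compare(scan_range, start, stop):
    """Time one serial scan against two parallel half-range scans.

    scan_range(start, stop) is a hypothetical callable that scans the
    key range and returns the number of rows it read.
    """
    # Serial case: a single scan over the whole range.
    serial_rows, serial_s = timed(lambda: scan_range(start, stop))

    # Parallel case: one scan per half, each on its own thread.
    halves = split_range(start, stop)

    def run_parallel():
        with ThreadPoolExecutor(max_workers=2) as pool:
            return sum(pool.map(lambda r: scan_range(*r), halves))

    parallel_rows, parallel_s = timed(run_parallel)

    return {
        "serial_rows": serial_rows,
        "serial_s": serial_s,
        "parallel_rows": parallel_rows,
        "parallel_s": parallel_s,
    }


def fake_scan(start, stop):
    """Made-up stand-in for a real scan: sleeps in proportion to range size."""
    time.sleep((stop - start) * 0.0005)
    return stop - start


if __name__ == "__main__":
    print(compare(fake_scan, 0, 100))
```

With a real client, `scan_range` would wrap a scan bounded by the region's start and stop keys. If the two half-range scans are truly independent, `parallel_s` should come out well below `serial_s`; the behaviour reported in the thread is the opposite.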