HBase >> mail # user >> the scan will be executed parallel if not use coprocessor?


Re: the scan will be executed parallel if not use coprocessor?
The HBase contract guarantees that rows are returned in row order.
That limits what can be done in parallel. For example, one could farm out the requests to the region servers in parallel, but the client library would still have to wait for the rows that sort first and deliver those to the application first.
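To make the ordered case concrete, here is a minimal sketch in plain Java. It does not use the HBase client API: each inner list is a hypothetical stand-in for one region's sorted rows, and `OrderedParallelScan` and `scan` are names invented for illustration. The requests all start at once, but delivery still blocks on the earliest region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical simulation, not HBase code: regions are given in start-key
// order and each region's rows are already sorted.
class OrderedParallelScan {
    static List<String> scan(List<List<String>> regions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, regions.size()));
        try {
            // Farm out one request per region; all of them run concurrently.
            List<Future<List<String>>> pending = new ArrayList<>();
            for (List<String> region : regions) {
                Callable<List<String>> fetch = () -> new ArrayList<>(region);
                pending.add(pool.submit(fetch));
            }
            // Deliver in region order: get() waits for the earliest region
            // even if later regions have already finished.
            List<String> delivered = new ArrayList<>();
            for (Future<List<String>> f : pending) {
                delivered.addAll(f.get());
            }
            return delivered;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> regions = List.of(
            List.of("a1", "a2"), List.of("m1"), List.of("x1", "x2"));
        System.out.println(scan(regions)); // prints [a1, a2, m1, x1, x2]
    }
}
```

The fetches overlap, but a slow first region still delays every row behind it, which is the limit the row-order contract imposes.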
We could add a new scan option that allows rows to be returned out of order; in that case the client could deliver the rows as they are retrieved.
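A rough sketch of what out-of-order delivery could look like, again in plain Java rather than against the HBase API. No such scan option exists in the discussion above beyond the proposal; `UnorderedParallelScan` and its simulated regions are assumptions made for illustration. Rows are handed over region by region in whatever order the regions finish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical simulation, not HBase code: each inner list stands in for
// the sorted rows of one region.
class UnorderedParallelScan {
    static List<String> scan(List<List<String>> regions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, regions.size()));
        CompletionService<List<String>> finished = new ExecutorCompletionService<>(pool);
        try {
            for (List<String> region : regions) {
                Callable<List<String>> fetch = () -> new ArrayList<>(region);
                finished.submit(fetch);
            }
            // take() yields whichever region completes next, so no region
            // waits for an earlier-sorting region to finish.
            List<String> delivered = new ArrayList<>();
            for (int i = 0; i < regions.size(); i++) {
                delivered.addAll(finished.take().get());
            }
            return delivered;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Region order in the output can differ from run to run.
        System.out.println(scan(List.of(
            List.of("a1", "a2"), List.of("m1"), List.of("x1", "x2"))));
    }
}
```

Rows within one region stay sorted in this sketch; only the order between regions is given up.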
With such an option, care must be taken that the parallel scanner behaves correctly when regions have moved. Currently the client scanner knows how far it got in the scan and just resets from there; that part would be a bit more tricky in the parallel case.
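The "knows how far it got and resets from there" behavior of the sequential scanner can be sketched as follows. This is a simplified simulation, not the actual client scanner: the table is a sorted set of row keys, and `ResumableScan`, `RegionMovedException`, and the `failAt` parameter are hypothetical names used only to trigger one simulated region move.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical simulation, not HBase code.
class ResumableScan {
    // Stand-in for "the region moved while we were scanning".
    static class RegionMovedException extends RuntimeException {}

    // Scans all row keys in order. If failAt is non-null, one simulated
    // region move happens just before that row would be returned.
    static List<String> scan(NavigableSet<String> table, String failAt) {
        List<String> delivered = new ArrayList<>();
        String lastRow = null;      // how far the scan got
        boolean failedOnce = false;
        while (true) {
            // Reopen the scan strictly after the last row already delivered.
            NavigableSet<String> remaining =
                lastRow == null ? table : table.tailSet(lastRow, false);
            try {
                for (String row : remaining) {
                    if (!failedOnce && row.equals(failAt)) {
                        failedOnce = true;
                        throw new RegionMovedException();
                    }
                    delivered.add(row);
                    lastRow = row;
                }
                return delivered;
            } catch (RegionMovedException e) {
                // Fall through to the next loop iteration and resume.
            }
        }
    }

    public static void main(String[] args) {
        NavigableSet<String> table = new TreeSet<>(List.of("a", "b", "c", "d"));
        System.out.println(scan(table, "c")); // prints [a, b, c, d]
    }
}
```

A single `lastRow` is enough here because rows arrive in order. A parallel scanner would presumably need one such position per region or key range, which is where the extra care comes in.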
-- Lars

----- Original Message -----
From: ramkrishna vasudevan <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Cc:
Sent: Sunday, July 14, 2013 9:15 PM
Subject: Re: the scan will be executed parallel if not use coprocessor?

By default, HBase does not use a parallel scanning mechanism; scans are
sequential.  There are some JIRA issues that try to implement scanning in
parallel across regions.  HBASE-1935 is one such idea.
Projects like Phoenix use coprocessors to scan the regions in parallel, and
the results are returned to the client.

Regards
Ram
On Mon, Jul 15, 2013 at 7:20 AM, ch huang <[EMAIL PROTECTED]> wrote:

> Phoenix uses coprocessors internally.
>
> On Sun, Jul 14, 2013 at 11:15 PM, Asaf Mesika <[EMAIL PROTECTED]>
> wrote:
>
> > To my knowledge, a scan is not parallel, hence the speed of queries in
> > Impala, Phoenix, and other similar projects.
> >
> > On Saturday, July 13, 2013, ch huang wrote:
> >
> > > Hi Ted, for example, I have a table with 10 regions. If my scan
> > > condition hits data in 8 of them, is there a difference between using
> > > the original scan and using a coprocessor? I know a coprocessor can work
> > > on each region in parallel, but why would the original scan be slower
> > > than the coprocessor?
> > >
> > >
> > >
> > > On Sat, Jul 13, 2013 at 7:36 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > >
> > > > Can you clarify your question a little bit?
> > > >
> > > > That is, are you expecting a parallel scan within a region boundary or
> > > > across region boundaries?
> > > >
> > > > Cheers
> > > >
> > > > On Jul 13, 2013, at 1:43 AM, ch huang <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > ATT
> > > >
> > >
> >
> >
> > --
> > Sent from Gmail Mobile
> >
>