Re: the scan will be executed parallel if not use coprocessor?
Yes, it may be worth looking at HBASE-1935.

Whether or not CP Observers (pre/post hooks) are used, scanning is
sequential from the HBase client side. Phoenix has its own client-side
code that makes multiple parallel scan requests to the servers (by
splitting the scan range).
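
As a rough illustration of the range-splitting idea (not Phoenix's actual code and not the HBase client API), here is a minimal plain-Java sketch. Each "region" is simulated by a sorted in-memory map, and the names `ParallelScanSketch`, `region` and `parallelScan` are made up for this example. One sub-scan per region runs on a thread pool, and the partial results are joined in region order so the caller still sees rows in row order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelScanSketch {

    // Hypothetical stand-in for a region: a sorted slice of the table's rows.
    static NavigableMap<String, String> region(String... rowKeys) {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (String key : rowKeys) {
            rows.put(key, "value-" + key);
        }
        return rows;
    }

    // Scans [startRow, stopRow) with one task per region. The regions list is
    // assumed to be sorted by key range, as regions of a table are.
    static List<String> parallelScan(List<NavigableMap<String, String>> regions,
                                     String startRow, String stopRow, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (NavigableMap<String, String> r : regions) {
                // Each sub-scan only touches the part of the range held by its region.
                Callable<List<String>> subScan =
                        () -> new ArrayList<>(r.subMap(startRow, true, stopRow, false).keySet());
                pending.add(pool.submit(subScan));
            }
            // Joining in region order keeps the overall result in row order,
            // even if a later region finishes before an earlier one.
            List<String> rows = new ArrayList<>();
            for (Future<List<String>> f : pending) {
                rows.addAll(f.get());
            }
            return rows;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("scan interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("sub-scan failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<NavigableMap<String, String>> regions = new ArrayList<>();
        regions.add(region("a", "b", "c"));
        regions.add(region("d", "e"));
        regions.add(region("f", "g", "h"));
        System.out.println(parallelScan(regions, "b", "h", 3));
    }
}
```

A real client would issue one scan per sub-range against the region servers instead of reading from maps, and would have to cope with regions splitting or moving while the sub-scans are running.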

We also have Endpoints. Execution of an Endpoint call from the client
side is parallel across the regions involved.

Just mentioning this to make it clear.

-Anoop-

On Tue, Jul 16, 2013 at 12:28 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> The HBase contract guarantees that rows are returned in row order.
> That puts limits on what can be done in parallel. For example, one could
> farm out the requests to the region servers in parallel, but the client
> would still have to wait for the rows that sort first and deliver those
> to the caller first.
> We could add a new scan option that optionally allows rows to be returned
> out of order; in that case the client could deliver the rows as they are
> retrieved.
> In that case care must be taken that the parallel scanner behaves
> correctly when regions have moved. Currently the client scanner knows how
> far it got in the scan and just resets from there; that part would be a
> bit more tricky in the parallel case.
>
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: ramkrishna vasudevan <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Cc:
> Sent: Sunday, July 14, 2013 9:15 PM
> Subject: Re: the scan will be executed parallel if not use coprocessor?
>
> By default, HBase does not use a parallel scanning mechanism; scanning
> is sequential.  There are some JIRAs that try to implement scanning in
> parallel on the regions.  HBASE-1935 is one such idea.
> Projects like Phoenix use coprocessors to scan the regions in parallel,
> and the results are returned to the clients.
>
> Regards
> Ram
>
>
> On Mon, Jul 15, 2013 at 7:20 AM, ch huang <[EMAIL PROTECTED]> wrote:
>
> > Phoenix uses coprocessors internally.
> >
> > On Sun, Jul 14, 2013 at 11:15 PM, Asaf Mesika <[EMAIL PROTECTED]>
> > wrote:
> >
> > > To my knowledge, a scan is not parallel, which is why queries in
> > > Impala, Phoenix, and other similar projects are faster.
> > >
> > > On Saturday, July 13, 2013, ch huang wrote:
> > >
> > > > > Hi Ted, for example, I have a table with 10 regions. If my scan
> > > > > condition hits data in 8 of those regions, is there a difference
> > > > > between using the original scan and using a coprocessor? I know a
> > > > > coprocessor can run in parallel for each region, but why would
> > > > > the original scan be slower than the coprocessor?
> > > >
> > > >
> > > >
> > > > > On Sat, Jul 13, 2013 at 7:36 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > > Can you clarify your question a little bit?
> > > > > >
> > > > > > That is, are you expecting a parallel scan within a region
> > > > > > boundary or across boundaries?
> > > > >
> > > > > Cheers
> > > > >
> > > > > > On Jul 13, 2013, at 1:43 AM, ch huang <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > ATT
> > > > >
> > > >
> > >
> > >
> > > --
> > > Sent from Gmail Mobile
> > >
> >
>
>
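
The out-of-order scan option suggested in Lars's quoted message can be sketched in plain Java as follows. This is only an illustration under stated assumptions: regions are simulated by sorted in-memory maps, and `UnorderedScanSketch`, `region` and `scanUnordered` are hypothetical names, not part of HBase. Rows are handed to the caller as soon as any region's sub-scan completes, so the delivery order across regions is not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class UnorderedScanSketch {

    // Hypothetical stand-in for a region: a sorted slice of the table's rows.
    static NavigableMap<String, String> region(String... rowKeys) {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (String key : rowKeys) {
            rows.put(key, "value-" + key);
        }
        return rows;
    }

    // Scans [startRow, stopRow) with one task per region and passes rows to
    // onRow in completion order. Returns the number of rows delivered.
    static int scanUnordered(List<NavigableMap<String, String>> regions,
                             String startRow, String stopRow, int threads,
                             Consumer<String> onRow) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CompletionService<List<String>> finished = new ExecutorCompletionService<>(pool);
        try {
            for (NavigableMap<String, String> r : regions) {
                finished.submit(
                        () -> new ArrayList<>(r.subMap(startRow, true, stopRow, false).keySet()));
            }
            int delivered = 0;
            for (int i = 0; i < regions.size(); i++) {
                // take() returns whichever sub-scan finished next, not the next
                // region in key order; rows within one region stay sorted.
                for (String row : finished.take().get()) {
                    onRow.accept(row);
                    delivered++;
                }
            }
            return delivered;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("scan interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("sub-scan failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<NavigableMap<String, String>> regions = new ArrayList<>();
        regions.add(region("a", "b", "c"));
        regions.add(region("d", "e"));
        regions.add(region("f", "g", "h"));
        int n = scanUnordered(regions, "b", "h", 3, row -> System.out.println(row));
        System.out.println(n + " rows delivered");
    }
}
```

The sketch leaves out the harder part raised in the quoted message: tracking how far each sub-scan got so that it can be restarted correctly when a region moves.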