|
Farrokh Shahriari
2013-02-01, 13:52
Mohammad Tariq
2013-02-01, 14:02
Farrokh Shahriari
2013-02-01, 14:57
Alexander Ignatov
2013-02-01, 15:07
Jean-Marc Spaggiari
2013-02-01, 15:29
lars hofhansl
2013-02-01, 15:46
Mohammad Tariq
2013-02-01, 16:40
James Taylor
2013-02-01, 17:07
Farrokh Shahriari
2013-02-02, 04:54
|
-
Parallel scan in HBase
Farrokh Shahriari 2013-02-01, 13:52
Hi there,
I have two questions about scans in HBase:
1) Does a scan with a specific filter run in parallel on different RegionServers?
2) Does the following code run on the client side when it searches through the retrieved results?

    for (Result result : scanner1) {
        for (KeyValue kv : result.raw()) {
            // some code
        }
    }

Farrokh Shahriari
-
Re: Parallel scan in HBase
Mohammad Tariq 2013-02-01, 14:02
Hello Farrokh,
A scan works sequentially, one region after another. A scan issued from the client does not go to the RegionServers in parallel. As for your second question, that code runs on the client side.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
-
Re: Parallel scan in HBase
Farrokh Shahriari 2013-02-01, 14:57
Thanks for your reply.
In my case I have to scan all rows in a table (about 1 million to 5 million rows), and it takes a long time. Is there any way to do this in parallel?
-
Re: Parallel scan in HBase
Alexander Ignatov 2013-02-01, 15:07
You could use the Coprocessors framework. To do that, you have to implement your own coprocessor module and deploy it on each RegionServer.

Here is an introductory article on how to use coprocessors:
https://blogs.apache.org/hbase/entry/coprocessor_introduction

--
Regards,
Alexander Ignatov
-
Re: Parallel scan in HBase
Jean-Marc Spaggiari 2013-02-01, 15:29
An MR job does almost exactly that.
The map method is called for each row, and you can have multiple jobs running at the same time. That is how the rowcounter works: it scans every row to count it, but spreads the work over all the nodes. Give it a look.

JM
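The idea described above (count every row, but spread the work over several workers) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the HBase or MapReduce APIs, a sorted `TreeMap` stands in for the table, a list of start keys stands in for the regions, and the names `RowCountSketch`, `countRange`, and `parallelCount` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Stand-in for a rowcounter-style job. NOT the HBase or MapReduce API:
// the "table" is an in-memory sorted map and "regions" are given by their start keys.
final class RowCountSketch {

    // Counts the rows in [startKey, stopKey); a null stopKey means "to the end of the table".
    static long countRange(NavigableMap<String, String> table, String startKey, String stopKey) {
        NavigableMap<String, String> slice = (stopKey == null)
                ? table.tailMap(startKey, true)
                : table.subMap(startKey, true, stopKey, false);
        long count = 0;
        for (String rowKey : slice.keySet()) {
            count++; // per-row work, analogous to one map() call per row
        }
        return count;
    }

    // Submits one counting task per region and sums the partial counts.
    // regionStartKeys must be sorted ascending, with "" as the first start key.
    static long parallelCount(NavigableMap<String, String> table,
                              List<String> regionStartKeys, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (int i = 0; i < regionStartKeys.size(); i++) {
                final String start = regionStartKeys.get(i);
                final String stop =
                        (i + 1 < regionStartKeys.size()) ? regionStartKeys.get(i + 1) : null;
                Callable<Long> task = () -> countRange(table, start, stop);
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a counting task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            table.put(String.format("row%04d", i), "v" + i);
        }
        List<String> regionStartKeys = Arrays.asList("", "row0250", "row0500", "row0750");
        System.out.println(parallelCount(table, regionStartKeys, 4));
    }
}
```

Counting is easy to parallelise this way because the partial results can be combined in any order. In a real cluster the per-region work would run next to the data rather than in client threads.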
-
Re: Parallel scan in HBase
lars hofhansl 2013-02-01, 15:46
The scan contract in HBase is that all rows are returned in order, so all regions have to be traversed in order as well.
It would be nice to add some facility to HBase to perform the scanning in parallel.
-
Re: Parallel scan in HBase
Mohammad Tariq 2013-02-01, 16:40
Do you need to scan each and every row within that range? Or do you need specific rows based on some filter?

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
-
Re: Parallel scan in HBase
James Taylor 2013-02-01, 17:07
If you run a SQL query that does aggregation (i.e., uses a built-in aggregation function like COUNT or does a GROUP BY), Phoenix will orchestrate the running of a set of queries in parallel, segmented along your row key (driven by the start/stop key plus region boundaries). We take advantage of a nifty feature that Lars added where you can pass in your own ExecutorService to an HTable, so you could do something similar.

Regards,
James
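The segmentation step mentioned above (splitting a start/stop key range along region boundaries) can be sketched as a small pure function. This is a hedged illustration, not Phoenix's or HBase's actual code: row keys are plain `String`s here rather than `byte[]`, the region start keys are passed in as a sorted list, `startKey` is assumed to sort before `stopKey`, and the names `RangeSplitSketch` and `splitRange` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: keys are Strings here, while HBase row keys are byte arrays.
final class RangeSplitSketch {

    // Splits [startKey, stopKey) at every region start key that falls strictly inside it.
    // regionStartKeys must be sorted ascending. Each result element is {segmentStart, segmentStop}.
    static List<String[]> splitRange(String startKey, String stopKey,
                                     List<String> regionStartKeys) {
        List<String[]> segments = new ArrayList<>();
        String segmentStart = startKey;
        for (String boundary : regionStartKeys) {
            boolean insideRange =
                    boundary.compareTo(segmentStart) > 0 && boundary.compareTo(stopKey) < 0;
            if (insideRange) {
                segments.add(new String[] {segmentStart, boundary});
                segmentStart = boundary;
            }
        }
        // The last segment runs from the final boundary inside the range to the stop key.
        segments.add(new String[] {segmentStart, stopKey});
        return segments;
    }

    public static void main(String[] args) {
        List<String> regionStartKeys = Arrays.asList("", "g", "n", "t");
        for (String[] segment : splitRange("c", "p", regionStartKeys)) {
            System.out.println(segment[0] + " .. " + segment[1]);
        }
    }
}
```

Each resulting segment lies within a single region, so one task per segment could be handed to an `ExecutorService`, with the partial results combined on the client.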
-
Re: Parallel scan in HBase
Farrokh Shahriari 2013-02-02, 04:54
Thank you guys,
@Mohammad: Yes, I have to retrieve all the rows and compare each of them to a specific value.
As I understand it, HBase doesn't support parallel scans by default, but I can implement one myself using coprocessors and the start/end row key of each region. Am I correct?

Farrokh
|