RE: endpoint coprocessor performance
Kimdhamilton 2013-03-06, 04:54
Yes, definitely. I'm following up tomorrow with more testing and will report back. I'm definitely seeing significant load on .META. but want to see what I can determine about the root cause.
-
Re: endpoint coprocessor performance
Kim Hamilton 2013-03-08, 01:02
I profiled it and getStartKeysInRange is taking all the time. Recall I'm running 0.92.1. I think these factors are consistent with https://issues.apache.org/jira/browse/HBASE-5492, which was fixed in 0.92.3.

We'll be upgrading soon, so I'll be able to verify the perf issue is gone.

Thanks for the help everyone!
-
Re: endpoint coprocessor performance
Gary Helmling 2013-03-08, 01:34
> I profiled it and getStartKeysInRange is taking all the time. Recall I'm
> running 0.92.1. I think these factors are consistent with
> https://issues.apache.org/jira/browse/HBASE-5492, which was fixed in
> 0.92.3.
>
> We'll be upgrading soon, so I'll be able to verify the perf issue is gone.

Unfortunately it doesn't look like that issue was ever resolved, so the fix version of 0.92.3 is not accurate. I cleared the fix version to avoid future confusion.

In any case, it looks like the same issue described in HBASE-6870, so if we can get that in, it should solve your problem.
-
Re: endpoint coprocessor performance
Andrew Purtell 2013-03-08, 01:35
> In any case, it looks like the same issue described in HBASE-6870, so if
> we can get that in, it should solve your problem.

So should we close HBASE-5492 as a dup?

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: endpoint coprocessor performance
Andrew Purtell 2013-03-08, 01:13
Thanks for reporting back!
-
endpoint coprocessor performance
Kim Hamilton 2013-03-05, 01:14
Hi all,

I've been lurking here for a while, so thanks for all the valuable tips and guidance you've given so far.

I'm running some experiments to understand where to use coprocessors. One interesting scenario is computing distinct values. I ran performance tests with two distinct value implementations: one using endpoint coprocessors, and one using just scans (computing distinct values client side only). I noticed that the endpoint coprocessor implementation averaged 80 ms slower than the scan implementation. Details of that are below for anyone interested.

To drill into the performance, I instrumented the code and ultimately deployed a no-op endpoint coprocessor, to look at the overhead of simply calling it. I'm measuring around 100ms for calling my empty, no-op endpoint coprocessor.

I need to do more tests, but I believe my tests are leading me to similar conclusions drawn here:
http://hbase-coprocessor-experiments.blogspot.com/2011/05/extending.html

I.e. if the query/scan is selective enough (I'll go out on a limb and estimate 50-100 rows), then it's better to just perform a scan and compute client side. Endpoint coprocessors will make sense for larger result sets and/or scans that hit multiple regions.

Before going too far, I wanted to check if anyone in this group has suggestions. I.e. perhaps there are just some configuration options I've not uncovered. Does this 100ms latency sound correct?

Thanks,
Kim

*Detailed results of distinct value comparison, just FYI*

Using 0.92.1-cdh4.1.0
Scan result size ~50-100
Row size 1kb, but after filtering for only desired columns, 380 bytes

*with coprocessors*
AverageLatency(ms), 176.1353
MinLatency(ms), 42
MaxLatency(ms), 2368
95thPercentileLatency(ms), 321
99thPercentileLatency(ms), 422

*scan-only, compute distinct values client side*
AverageLatency(ms), 92.8165
MinLatency(ms), 4
MaxLatency(ms), 986
95thPercentileLatency(ms), 253
99thPercentileLatency(ms), 356
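The gap between the two runs in Kim's post can be checked with a few subtractions. This is only a minimal sketch: the class and method names are made up for illustration, and the only real data are the latency figures copied from the post.

```java
/** Differences between the two latency summaries in the post (all values in ms). */
final class LatencyGapSketch {
    // Figures copied from the post: {with coprocessors, scan-only}.
    static final double[] AVERAGE = {176.1353, 92.8165};
    static final double[] MIN = {42, 4};
    static final double[] P95 = {321, 253};
    static final double[] P99 = {422, 356};

    /** Extra latency of the coprocessor run over the scan-only run. */
    static double gap(double[] pair) {
        return pair[0] - pair[1];
    }

    public static void main(String[] args) {
        System.out.println("average gap (ms): " + gap(AVERAGE));
        System.out.println("min gap (ms):     " + gap(MIN));
        System.out.println("p95 gap (ms):     " + gap(P95));
        System.out.println("p99 gap (ms):     " + gap(P99));
    }
}
```

The average gap works out to roughly 83 ms, in line with the "averaged 80 ms slower" remark, and the gap at the minimum is 38 ms.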
-
Re: endpoint coprocessor performance
Andrew Purtell 2013-03-05, 01:43
Do you have timing results for an Apache HBase release? Our last release was 0.94.5.
-
Re: endpoint coprocessor performance
Andrew Purtell 2013-03-05, 02:05
Please disregard. James may have nailed it and that's not version dependent.
-
Re: endpoint coprocessor performance
James Taylor 2013-03-05, 01:58
Check your logs for whether your end-point coprocessor is hitting zookeeper on every invocation to figure out the region start key. Unfortunately (at least last time I checked), the default way of invoking an end point coprocessor doesn't use the meta cache. You can go through a combination of the following instead:

    HRegionLocation regionLocation = retried ?
        connection.relocateRegion(tableName, tableKey) :
        connection.locateRegion(tableName, tableKey);
    ...

Then call HConnection.processExecs, passing in the regionKeys from above. You can trap the error case of the region being relocated and try again with retried = true and it'll update the meta data cache when relocateRegion is called.

Once we made this change for Phoenix, our latencies went way down.

HTH,

James
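The locate-then-relocate pattern James describes can be modeled in plain Java without any HBase dependency. This is a minimal sketch under stated assumptions, not HBase or Phoenix code: `RegionLocationCacheSketch`, `StaleLocationException`, `locate`, `relocate`, and `call` are hypothetical names, the "meta lookup" is just a function supplied by the caller, and the cache is keyed by row key for simplicity rather than by region range.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy model of a client-side region location cache with one retry; not HBase code. */
final class RegionLocationCacheSketch {
    /** Hypothetical error raised when the cached server no longer hosts the region. */
    static final class StaleLocationException extends RuntimeException {}

    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> metaLookup; // stands in for a read of .META.
    int metaLookups = 0;                                // how often "meta" was consulted

    RegionLocationCacheSketch(Function<String, String> metaLookup) {
        this.metaLookup = metaLookup;
    }

    /** Analogue of locateRegion: serve from the cache, consult "meta" only on a miss. */
    String locate(String rowKey) {
        return cache.computeIfAbsent(rowKey, this::lookup);
    }

    /** Analogue of relocateRegion: bypass the cache and refresh the cached entry. */
    String relocate(String rowKey) {
        String server = lookup(rowKey);
        cache.put(rowKey, server);
        return server;
    }

    private String lookup(String rowKey) {
        metaLookups++;
        return metaLookup.apply(rowKey);
    }

    /** Try the cached location first; on a stale location, relocate once and retry. */
    <T> T call(String rowKey, Function<String, T> exec) {
        boolean retried = false;
        while (true) {
            String server = retried ? relocate(rowKey) : locate(rowKey);
            try {
                return exec.apply(server);
            } catch (StaleLocationException e) {
                if (retried) {
                    throw e; // still stale after a refresh: give up
                }
                retried = true;
            }
        }
    }

    public static void main(String[] args) {
        String[] current = {"rs1"}; // where the region "really" lives
        RegionLocationCacheSketch client = new RegionLocationCacheSketch(key -> current[0]);
        Function<String, String> exec = server -> {
            if (!server.equals(current[0])) {
                throw new StaleLocationException();
            }
            return "ok@" + server;
        };
        System.out.println(client.call("row", exec)); // first call consults "meta"
        System.out.println(client.call("row", exec)); // second call is served from the cache
        current[0] = "rs2";                           // the region moves
        System.out.println(client.call("row", exec)); // stale hit, then one refresh
        System.out.println("meta lookups: " + client.metaLookups);
    }
}
```

In this model the lookup counter only grows on a cache miss or after a stale location, which is the behaviour the workaround relies on to keep load off the server hosting the location table.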
-
Re: endpoint coprocessor performance
Gary Helmling 2013-03-05, 02:23
> Check your logs for whether your end-point coprocessor is hitting
> zookeeper on every invocation to figure out the region start key.
> Unfortunately (at least last time I checked), the default way of invoking
> an end point coprocessor doesn't use the meta cache. You can go through a
> combination of the following instead:
> HRegionLocation regionLocation = retried ?
> connection.relocateRegion(tableName, tableKey) :
> connection.locateRegion(tableName, tableKey);
> ...
> Then call HConnection.processExecs call, passing in the regionKeys from
> above.
> You can trap the error case of the region being relocated and try again
> with retried = true and it'll update the meta data cache when
> relocateRegion is called.

Any idea if we have an improvement logged in JIRA for this? This is definitely something we should improve on.
-
Re: endpoint coprocessor performance
Gary Helmling 2013-03-05, 02:30
I see this is HBASE-6870. I thought that sounded familiar.
-
Re: endpoint coprocessor performance
Stephen Boesch 2013-03-05, 04:08
great question from Kim and follow-up/answers.
-
Re: endpoint coprocessor performance
Kim Hamilton 2013-03-05, 21:13
Thanks so much! This describes exactly what I'm seeing. I did notice extremely heavy load on the region server carrying .META., as described in HBASE-6870:

    In current logic, HTable#coprocessorExec always scan the whole table, its efficiency is low and will affect the Regionserver carrying .META. under large coprocessorExec requests

Thanks again,
Kim
-
Re: endpoint coprocessor performanceAndrew Purtell 2013-03-06, 01:58
> In current logic, HTable#coprocessorExec always scan the whole table, its
efficiency is low

No, I don't think that is correct. In its current logic, coprocessorExec always scans the META table for all regions of the target table, to find the up to date locations, and then dispatches the exec in parallel to all regions of the target table. The efficiency of the exec is actually high because invocations happen in parallel across the cluster, with results reassembled back at the client as they come in.

The increased setup latency relative to a Scan and the load on META is because of the initial scan on META to find the up to date locations of all regions of the target table. For a Scan, the cached locations of regions are used, and relocations are handled transparently by the client. Exec could be updated to do this as well.

On Wed, Mar 6, 2013 at 5:13 AM, Kim Hamilton <[EMAIL PROTECTED]> wrote:

> Thanks so much! This describes exactly what I'm seeing. I did notice extremely heavy load on the region server carrying .META., as described in HBASE-6870:
>
> In current logic, HTable#coprocessorExec always scan the whole table, its efficiency is low and will affect the Regionserver carrying .META. under large coprocessorExec requests
>
> Thanks again,
> Kim
>
> On Mon, Mar 4, 2013 at 8:08 PM, Stephen Boesch <[EMAIL PROTECTED]> wrote:
>
> > great question from Kim and follow-up/answers.
> >
> > 2013/3/4 Gary Helmling <[EMAIL PROTECTED]>
> >
> > > I see this is HBASE-6870. I thought that sounded familiar.
> > >
> > > On Mon, Mar 4, 2013 at 6:23 PM, Gary Helmling <[EMAIL PROTECTED]> wrote:
> > >
> > > >> Check your logs for whether your end-point coprocessor is hitting zookeeper on every invocation to figure out the region start key. Unfortunately (at least last time I checked), the default way of invoking an end point coprocessor doesn't use the meta cache. You can go through a combination of the following instead:
> > > >> HRegionLocation regionLocation = retried ?
> > > >>     connection.relocateRegion(tableName, tableKey) :
> > > >>     connection.locateRegion(tableName, tableKey);
> > > >> ...
> > > >> Then call HConnection.processExecs call, passing in the regionKeys from above.
> > > >> You can trap the error case of the region being relocated and try again with retried = true and it'll update the meta data cache when relocateRegion is called.
> > > >
> > > > Any idea if we have an improvement logged in JIRA for this? This is definitely something we should improve on.

--
Best regards,
- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Andrew Purtell 2013-03-06, 01:58
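The pattern discussed above (use the cached region location first, and go back to META only when the cached location turns out to be stale) can be sketched in plain Java. This is a minimal illustration, not HBase code: `CachedRegionLocator`, `StaleLocationException`, `locate`, `execWithRetry` and the `metaLookup` function are all hypothetical names, and region keys and server names are simplified to strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of "cached lookup first, relocate on failure". Not the HBase client API. */
final class CachedRegionLocator {
    /** Thrown by the simulated call when the cached server no longer hosts the region. */
    static class StaleLocationException extends RuntimeException {}

    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> metaLookup; // stands in for a lookup against META
    int metaLookups = 0;                               // counts how often META is consulted

    CachedRegionLocator(Function<String, String> metaLookup) {
        this.metaLookup = metaLookup;
    }

    /** Returns the cached server for a region; consults META only on a miss or a forced reload. */
    String locate(String regionKey, boolean reload) {
        if (reload || !cache.containsKey(regionKey)) {
            metaLookups++;
            cache.put(regionKey, metaLookup.apply(regionKey));
        }
        return cache.get(regionKey);
    }

    /** Tries the call with the cached location, then retries once after refreshing the cache. */
    <R> R execWithRetry(String regionKey, Function<String, R> call) {
        try {
            return call.apply(locate(regionKey, false));
        } catch (StaleLocationException e) {
            return call.apply(locate(regionKey, true));
        }
    }
}
```

With this shape, repeated calls against an unmoved region cost one META lookup in total, and a moved region costs one extra lookup on the first call after the move.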
-
RE: endpoint coprocessor performanceAnoop Sam John 2013-03-06, 03:14
Yes agree with Andrew here... I checked the 94 code base yday. I also feel that the efficiency should be on the higher side.. And there is no whole table scan. The HBase client issues scan for only those regions which come under the start/stop keys that app specified. Yes it is contacting .META. to know the regions coming within the start/stop rows. But that should not be a big efficiency issue IMHO also.
@Kim - Can you do some profiling and let us know which area of code is eating up time in your case? HBASE-6877 also I am seeing.
-Anoop-
Anoop Sam John 2013-03-06, 03:14
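To illustrate the point that only regions falling within the start/stop keys are contacted, here is a rough sketch of selecting regions by key range. It is not the HBase implementation: `RegionRangeSelector` and `regionsInRange` are hypothetical names, row keys are simplified to strings rather than byte arrays, each region is identified by its start key, and both range bounds are treated as inclusive for the purpose of the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

/** Hypothetical sketch: pick the regions whose key range overlaps [startRow, stopRow]. */
final class RegionRangeSelector {
    /**
     * @param regionStartKeys sorted start keys, one per region; a region runs from its
     *                        start key up to the next region's start key
     * @return start keys of the regions that overlap the requested range
     */
    static List<String> regionsInRange(NavigableSet<String> regionStartKeys,
                                       String startRow, String stopRow) {
        if (startRow.compareTo(stopRow) > 0) {
            throw new IllegalArgumentException("startRow must not sort after stopRow");
        }
        // The region containing startRow is the one with the greatest start key <= startRow.
        String first = regionStartKeys.floor(startRow);
        // If no region starts at or before startRow, begin the selection at startRow itself.
        String from = (first != null) ? first : startRow;
        return new ArrayList<>(regionStartKeys.subSet(from, true, stopRow, true));
    }
}
```

For a table split at "g", "n" and "t", a request for rows "h" through "p" would select only the regions starting at "g" and "n", so the other two regions are never contacted.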
-
Re: endpoint coprocessor performanceGary Helmling 2013-03-05, 01:42
> I'm running some experiments to understand where to use coprocessors. One interesting scenario is computing distinct values. I ran performance tests with two distinct value implementations: one using endpoint coprocessors, and one using just scans (computing distinct values client side only). I noticed that the endpoint coprocessor implementation averaged 80 ms slower than the scan implementation. Details of that are below for anyone interested.
>
> To drill into the performance, I instrumented the code and ultimately deployed a no-op endpoint coprocessor, to look at the overhead of simply calling it. I'm measuring around 100ms for calling my empty, no-op endpoint coprocessor.

100ms to do a single no-op coprocessor call seems very high. Do you have more details of where you see the code spending time? Or even better, can you post sample code somewhere?

Also, which version of HBase are you testing with?

> I need to do more tests, but I believe my tests are leading me to similar conclusions drawn here:
> http://hbase-coprocessor-experiments.blogspot.com/2011/05/extending.html
>
> I.e. if the query/scan is selective enough (I'll go out on a limb and estimate 50-100 rows), then it's better to just perform a scan and compute client side. Endpoint coprocessors will make sense for larger result sets and/or scans that hit multiple regions.

I would certainly agree with this. Coprocessor endpoints are not a replacement for the regular HBase client APIs. They're really meant to allow you to extend HBase with new capabilities. Coprocessor endpoints will allow you to parallelize operations across multiple regions, which can be a powerful capability if you need it, or will allow you to maintain some pre-computed state server-side and then easily retrieve it from the client.

If you're scanning larger amounts of data and computing a much smaller result, endpoints will also save transferring the full data set over the network back to the client, but you'll still need to scan through the data server-side.

In your case, are you applying the same scan options in the coprocessor (start/end row, any filtering)?

> Before going too far, I wanted to check if anyone in this group has suggestions. I.e. perhaps there are just some configuration options I've not uncovered. Does this 100ms latency sound correct?

It would help to have more details of what your code is actually doing. Can you post an extract of what's running in the coprocessor?

--gh
Gary Helmling 2013-03-05, 01:42
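The trade-off described above for the distinct-values case (each region reduces its own rows, and the client only merges the much smaller partial results) can be sketched without any HBase dependency. Everything here is illustrative: `DistinctValuesSketch` and its methods are hypothetical names, and each region's column values are modelled as a plain list of strings.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of per-region pre-aggregation followed by a client-side merge. */
final class DistinctValuesSketch {
    /** Server-side step: one region reduces its rows to a set of distinct values. */
    static Set<String> distinctInRegion(List<String> regionValues) {
        return new TreeSet<>(regionValues);
    }

    /** Client-side step: union the partial results returned by each region. */
    static Set<String> mergeAtClient(List<Set<String>> partials) {
        Set<String> merged = new TreeSet<>();
        for (Set<String> partial : partials) {
            merged.addAll(partial);
        }
        return merged;
    }

    /** Number of values sent back to the client when every region pre-aggregates. */
    static int valuesShipped(List<Set<String>> partials) {
        int count = 0;
        for (Set<String> partial : partials) {
            count += partial.size();
        }
        return count;
    }
}
```

For two regions holding the values (a, b, a) and (b, c, c, c), the regions return 4 values in total instead of all 7 rows, and the merged result is {a, b, c}. The saving grows with the ratio of rows to distinct values, which fits the observation that a very selective scan gains little from the endpoint approach.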
|