|
Peter Wolf
2012-03-17, 19:46
Lars George
2012-03-18, 09:28
Peter Wolf
2012-03-18, 14:13
lars hofhansl
2012-03-18, 21:51
Peter Wolf
2012-03-18, 23:40
Lars George
2012-03-19, 09:58
Peter Wolf
2012-03-19, 18:24
lars hofhansl
2012-03-20, 15:50
Peter Wolf
2012-03-23, 01:01
Michel Segel
2012-03-23, 11:55
Peter Wolf
2012-03-23, 12:04
Doug Meil
2012-03-23, 18:41
lars hofhansl
2012-03-23, 18:49
Peter Wolf
2012-03-23, 20:31
|
-
Confirming a Bug
Peter Wolf, 2012-03-17, 19:46
Hello,
A couple of days ago, I asked about strange behavior in my "Scan.addFamiliy reduces results" thread.

I want to confirm that I did find a bug, and if so, how to submit a bug report.

The basic strangeness is that changing the amount of caching changes the number of results. In the original thread, this was confused by the fact that adding different families also changed the number of results. We thought it was a filtering problem.

However, changing nothing but the setCaching() value changes the number of results. Furthermore, the result difference is a multiple of the setCaching() value.

Here is the pseudo code:

    Scan scan = new Scan(...);
    scan.addFamily(...);
    Filter filter = ...
    scan.setFilter(filter);

    --> scan.setCaching(10000); <--

    scanner = hTable.getScanner(scan);
    Iterator<Result> it = scanner.iterator();
    while (it.hasNext()) {
        Result result = it.next();
        ...
    }

Thank you
Peter
-
Re: Confirming a Bug
Lars George, 2012-03-18, 09:28
Hi Peter,
Could you be hitting HBASE-5121? Or even HBASE-2856?

Lars
-
Re: Confirming a Bug
Peter Wolf, 2012-03-18, 14:13
Hi Lars,
I don't think so... My behavior is definitely tied to the amount of data in each Result. There definitely seems to be some sort of threshold. Changing the caching amount produces completely repeatable behavior. 10,000, 5,000, and 1,000 each produce different repeatable results, and changing the families added also produces different reliable results. There is no "sometimes" or "occasional", and if there was a Major Compaction, it wouldn't happen that often.

https://issues.apache.org/jira/browse/HBASE-5121
https://issues.apache.org/jira/browse/HBASE-2856

Note that with all my families added, each result is a few thousand bytes in size. Is that unusually large?

Thanks
P
-
Re: Confirming a Bug
lars hofhansl, 2012-03-18, 21:51
Hi Peter,
(this is the other Lars)

Does this depend on your dataset at all? Doesn't it also happen for smaller values of scanner caching?

Any chance that you can reproduce this in a unit test and file a JIRA?
If you do (specifically the test), I promise I'll look at it this week :)

-- Lars (H)
-
Re: Confirming a Bug
Peter Wolf, 2012-03-18, 23:40
Excellent! Thank you very much (other) Lars.
I have only tested this on one dataset, and only on a few values of caching. I certainly get different results with 10,000, 5,000, and 1,000 caching. 1,000 gives me the same results as the default. I also get different results when I add families to the Scan.

I seem to be surpassing some maximum buffer size. The number of results is always the correct value minus some multiple of the cache size. For example, the correct value was 24,452, but when caching was set to 10,000, I got 4,452 results. When I then removed a family from the scan, I got 14,452 results.

I'll try to write a standalone program to reproduce this. I'll get back to you soon.

P

P.S. I just want to check. The following code counts the number of results. I don't need to do anything to "get the next cache" or something, do I?

    Iterator<Result> it = scanner.iterator();
    while (it.hasNext()) {
        Result result = it.next();
        ...
    }
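
As a sanity check on the counts quoted in this message, the shortfall can be worked out mechanically. The following is a minimal sketch in plain Java, not HBase code; the class and method names (MissingBatches, missingBatches) are hypothetical and only illustrate the "correct value minus a multiple of the cache size" observation.

```java
/** Worked arithmetic for the counts reported in the message (not HBase code). */
final class MissingBatches {
    /**
     * Returns how many whole caching-sized batches are missing, or -1 if the
     * shortfall is not an exact multiple of the caching value.
     */
    static long missingBatches(long expected, long observed, long caching) {
        long shortfall = expected - observed;
        if (caching <= 0 || shortfall < 0 || shortfall % caching != 0) {
            return -1;
        }
        return shortfall / caching;
    }

    public static void main(String[] args) {
        // Numbers quoted in the message: 24,452 rows expected, caching = 10,000.
        System.out.println(missingBatches(24_452, 4_452, 10_000));  // all families: prints 2
        System.out.println(missingBatches(24_452, 14_452, 10_000)); // one family removed: prints 1
    }
}
```

With the reported numbers, two whole batches are missing with all families in the scan and one batch is missing after a family is removed, which is consistent with rows being dropped a full cache-load at a time.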
-
Re: Confirming a Bug
Lars George, 2012-03-19, 09:58
Hi Peter,
Lars #1 here again :)

That is fine, the caching is done transparently for you. But what I also suggest is counting the number of KeyValues you get back, just to confirm. In other words, iterate over the result and check how many actual KVs you get back. The reason I am asking is that, for example, scanner batching will change the behavior: you will get a Result instance per batch, not per row.

Thanks for digging in!

Lars
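
To illustrate why counting KeyValues is a more stable check than counting Results, here is a toy model in plain Java. It uses no HBase classes: rows are modelled as lists of strings, and the names BatchingModel, scan and countCells are hypothetical. It only assumes what the message states, namely that with batching each Result carries one batch of a row rather than the whole row.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of scanner batching; plain Java, no HBase classes involved. */
final class BatchingModel {
    /**
     * Splits each row (a list of cells) into chunks of at most `batch` cells,
     * one chunk per returned result. A batch of 0 or less means no batching:
     * one result per row.
     */
    static List<List<String>> scan(List<List<String>> rows, int batch) {
        List<List<String>> results = new ArrayList<>();
        for (List<String> row : rows) {
            if (batch <= 0) {
                results.add(new ArrayList<>(row));
                continue;
            }
            for (int i = 0; i < row.size(); i += batch) {
                int end = Math.min(i + batch, row.size());
                results.add(new ArrayList<>(row.subList(i, end)));
            }
        }
        return results;
    }

    /** Total number of cells across all results, independent of chunking. */
    static int countCells(List<List<String>> results) {
        int total = 0;
        for (List<String> result : results) {
            total += result.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Three rows of five cells each.
        List<List<String>> rows = new ArrayList<>();
        for (int r = 0; r < 3; r++) {
            List<String> row = new ArrayList<>();
            for (int c = 0; c < 5; c++) {
                row.add("row" + r + "/col" + c);
            }
            rows.add(row);
        }
        for (int batch : new int[] {0, 2, 5}) {
            List<List<String>> results = scan(rows, batch);
            System.out.println("batch=" + batch
                    + " results=" + results.size()
                    + " cells=" + countCells(results));
        }
    }
}
```

In this model the number of results changes with the batch size (3, 9 and 3 for the three settings) while the cell count stays at 15, so a cell count that changes between runs points at lost data rather than at a different chunking.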
-
Re: Confirming a Bug
Peter Wolf, 2012-03-19, 18:24
Hello Lars and Lars,
Thank you for your help and attention.

I wrote a standalone test that exhibits the bug.

http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java

Here is the output. It shows how the number of results and key-value pairs varies as caching is changed and families are included. It shows the bug starting with 3 families and 5000 caching. It also shows a new bug, where the query always fails with an IOException with 4 families.

CacheSize  FamilyCount  ResultCount  KeyValueCount
1000       1            10000        10
5000       1            10000        10
10000      1            10000        10
1000       2            10000        20
5000       2            10000        20
10000      2            10000        20
1000       3            10000        30
5000       3            5000         30
10000      3            0            -1

Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server domu-12-31-39-05-6d-02.compute-1.internal:60020 for region bug,,1332174647830.ef906b7bd8eea8482c84edd906df24fd., row '\x00\x00\x00{\x00\x00\x00\x00\x00\x00\x00\x00', but failed after 10 attempts.
Exceptions:
java.io.IOException: java.io.IOException: Call to ... failed on local exception: java.io.IOException: Unexpected exception receiving call responses
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1231)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1170)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1275)
    ... 7 more

Here is the main(). Note that createTable() and createData() are commented out. Uncomment these to populate the test table.

    public static void main(String[] args) {

        try {
            //createTable("bug");

            HBaseScanCacheBug bug = new HBaseScanCacheBug("bug");
            int id = 123;

            //bug.createData(id);

            System.out.println("CacheSize FamilyCount ResultCount KeyValueCount");
            for (int familyCount = 1; familyCount < 5; familyCount++) {
                bug.scan(id, 1000, familyCount);
                bug.scan(id, 5000, familyCount);
                bug.scan(id, 10000, familyCount);
            }

        } catch (IOException e) {
            throw new Error(e);
        }

    }

    private static Configuration getConfiguration() {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "Put Your Server Here");
        conf.setInt("hbase.client.prefetch.limit", 100);
        return conf;
    }
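
For anyone rerunning the test, the output table can be screened automatically for the affected runs. This is a minimal sketch in plain Java with hypothetical names (ScanOutputCheck, suspiciousRows); it assumes the four-column output format printed above and that 10,000 is the correct result count, as the unaffected runs suggest.

```java
import java.util.ArrayList;
import java.util.List;

/** Flags rows of the test output whose ResultCount differs from the expected value. */
final class ScanOutputCheck {
    /**
     * Scans the printed table line by line. Lines that are not four
     * whitespace-separated columns starting with a number (the header,
     * stack trace lines) are skipped.
     */
    static List<String> suspiciousRows(String output, int expectedResults) {
        List<String> flagged = new ArrayList<>();
        for (String line : output.split("\n")) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length != 4 || !cols[0].matches("\\d+") || !cols[2].matches("-?\\d+")) {
                continue; // header or unrelated output
            }
            if (Integer.parseInt(cols[2]) != expectedResults) {
                flagged.add(line.trim());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // The three-family rows from the output quoted in the message.
        String output = String.join("\n",
                "CacheSize FamilyCount ResultCount KeyValueCount",
                "1000 3 10000 30",
                "5000 3 5000 30",
                "10000 3 0 -1");
        for (String row : suspiciousRows(output, 10_000)) {
            System.out.println("unexpected ResultCount: " + row);
        }
    }
}
```

Run against the three-family rows, this flags the 5000 and 10000 caching runs and leaves the 1000 caching run alone.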
-
Re: Confirming a Bug
lars hofhansl, 2012-03-20, 15:50
Thanks Peter,
will have a look today.

-- Lars
-
Re: Confirming a Bug
Peter Wolf, 2012-03-23, 01:01
Hello again Lars and Lars,
Here is some additional information that may help you track this down.

I think this behavior has something to do with my VPN. My servers are on the Amazon Cloud and I normally run my client on my laptop via a VPN (Tunnelblick: OS X 10.7.3; Tunnelblick 3.2.3 (build 2891.2932)). This is where I see the buggy behavior I describe.

However, when my client is running on an EC2 machine, I get different behavior. I cannot prove that it is always correct, but in at least one case my current code does not work on my laptop, but gets the correct number of results on an EC2 machine. Note that my scans are also much faster on the EC2 machine.

I will do more tests to see if I can localize it further.

Hope this helps
Thank you again
Peter
-
Re: Confirming a Bug
Michel Segel, 2012-03-23, 11:55
Peter, that doesn't make sense.

I mean I believe you in what you are saying, but I don't see how a VPN would cause this variance in results.

Do you have any speculative execution turned on?

Are you counting just the number of rows in the result set, or are you using counters in the map reduce? (I'm assuming that you are running a map/reduce, and not just a simple connection and single-threaded scan...).

I apologize if this has already been answered; I haven't been following this too closely.

Mike Segel
-
Re: Confirming a Bug
Peter Wolf, 2012-03-23, 12:04
Hi Michel,
I agree it doesn't make sense, but then I believe we are tracking a bug.

I don't know about speculative execution, but I certainly did not switch it on.

I am just counting the number of rows that come back in the Result.

If you are interested in this, try my unit test. I'd be very interested to see if it behaves the same for others.

http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java

Here is the output. It shows how the number of results and key-value pairs varies as caching is changed and families are included. It shows the bug starting with 3 families and 5000 caching. It also shows a new bug, where the query always fails with an IOException with 4 families.

CacheSize  FamilyCount  ResultCount  KeyValueCount
1000       1            10000        10
5000       1            10000        10
-
Re: Confirming a Bug
Doug Meil, 2012-03-23, 18:41
Speculative execution is on by default.

http://hbase.apache.org/book.html#mapreduce.specex
-
Re: Confirming a Bug lars hofhansl 2012-03-23, 18:49
Sorry... Distracted by trying to do a 0.94rc.
VPN... Hmm... Do you see any packet fragmentation/truncation?

-- Lars

----- Original Message -----
From: Peter Wolf <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
Cc:
Sent: Thursday, March 22, 2012 6:01 PM
Subject: Re: Confirming a Bug

Hello again Lars and Lars,

Here is some additional information that may help you track this down.

I think this behavior has something to do with my VPN. My servers are on the Amazon Cloud and I normally run my client on my laptop via a VPN (Tunnelblick: OS X 10.7.3; Tunnelblick 3.2.3 (build 2891.2932)). This is where I see the buggy behavior I describe.

However, when my Client is running on an EC2 machine, then I get different behavior. I can not prove that it is always correct, but in at least one case my current code does not work on my laptop, but gets the correct number of results on an EC2 machine. Note that my scans are also much faster on the EC2 machine.

I will do more tests to see if I can localize it further.

Hope this helps
Thank you again
Peter


On 3/19/12 2:24 PM, Peter Wolf wrote:
> Hello Lars and Lars,
>
> Thank you for your help and attention.
>
> I wrote a standalone test that exhibits the bug.
>
> http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java
>
> Here is the output. It shows how the number of results and key value pairs varies as caching is changed, and families are included. It shows the bug starting with 3 families and 5000 caching. It also shows a new bug, where the query always fails with an IOException with 4 families.
>
> CacheSize FamilyCount ResultCount KeyValueCount
> 1000      1           10000       10
> 5000      1           10000       10
> 10000     1           10000       10
> 1000      2           10000       20
> 5000      2           10000       20
> 10000     2           10000       20
> 1000      3           10000       30
> 5000      3           5000        30
> 10000     3           0           -1
> Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server domu-12-31-39-05-6d-02.compute-1.internal:60020 for region bug,,1332174647830.ef906b7bd8eea8482c84edd906df24fd., row '\x00\x00\x00{\x00\x00\x00\x00\x00\x00\x00\x00', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: java.io.IOException: Call to ... failed on local exception: java.io.IOException: Unexpected exception receiving call responses
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1231)
>     at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1170)
>     at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1275)
>     ... 7 more
>
>
> Here is the main(). Note that createTable() and createData() are commented out. Uncomment these to populate the test table.
>
> public static void main(String[] args) {
>
>     try {
>         //createTable("bug");
>
>         HBaseScanCacheBug bug = new HBaseScanCacheBug("bug");
>         int id = 123;
>
>         //bug.createData(id);
>
>         System.out.println("CacheSize FamilyCount ResultCount KeyValueCount");
>         for (int familyCount = 1; familyCount < 5; familyCount++) {
>             bug.scan(id, 1000, familyCount);
>             bug.scan(id, 5000, familyCount);
>             bug.scan(id, 10000, familyCount);
>         }
>
>     } catch (IOException e) {
>         throw new Error(e);
>     }
>
> }
>
> private static Configuration getConfiguration() {
>     Configuration conf = HBaseConfiguration.create();
>     conf.set("hbase.zookeeper.quorum", "Put Your Server Here");
>     conf.setInt("hbase.client.prefetch.limit", 100);
>     return conf;
> }
>
>
>
> On 3/19/12 5:58 AM, Lars George wrote:
>> Hi Peter,
>>
>> Lars #1 here again :)
>>
>> That is fine, the caching is done transparently for you. But what I also suggest is counting the number of KeyValues you get back, just to confirm. In other words, iterate over the result and check how many actual KVs you get back. The reason I am asking is that for example scanner batching will change the behavior, you will get a Result instance per batch, not per row.
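
Lars George's point about batching can be illustrated without a cluster. The sketch below is a plain-Java model, not HBase code: `BatchCountSketch`, `resultsForRow` and `count` are hypothetical names, and the only behavior it assumes is the one stated in the quoted message, namely that with batching a row comes back as one Result per batch rather than one Result per row. It suggests why the number of KeyValues is a steadier check than the number of Result instances.

```java
// Plain-Java model of the batching effect; no HBase classes are used.
class BatchCountSketch {

    /**
     * Number of Result-like chunks a single row is split into.
     * A batch value of 0 or less stands for "no batching" (one chunk per row).
     */
    static int resultsForRow(int kvsInRow, int batch) {
        if (kvsInRow <= 0) {
            return 0; // an empty row yields nothing
        }
        if (batch <= 0) {
            return 1; // whole row in one chunk
        }
        return (kvsInRow + batch - 1) / batch; // ceiling division
    }

    /** Returns {resultCount, keyValueCount} for rows with the given KeyValue counts. */
    static long[] count(int[] kvsPerRow, int batch) {
        long results = 0;
        long keyValues = 0;
        for (int kvs : kvsPerRow) {
            results += resultsForRow(kvs, batch);
            keyValues += Math.max(kvs, 0);
        }
        return new long[] {results, keyValues};
    }

    public static void main(String[] args) {
        // Assumed shape for illustration: 10000 rows with 30 KeyValues each.
        int[] rows = new int[10000];
        java.util.Arrays.fill(rows, 30);

        for (int batch : new int[] {0, 10, 30}) {
            long[] c = count(rows, batch);
            System.out.println("batch=" + batch
                    + " results=" + c[0] + " keyValues=" + c[1]);
        }
    }
}
```

In this model the KeyValue total is the same for every batch setting while the Result total changes, so a varying Result count alone would not prove that data is missing.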
-
Re: Confirming a Bug Peter Wolf 2012-03-23, 20:31
No problem. Still trying to get a handle on when it happens.
There is no error, and the results seem valid. There are just not enough of them.

Would packet fragmentation/truncation cause errors or corruption?

P

On 3/23/12 2:49 PM, lars hofhansl wrote:
> Sorry... Distracted by trying to do a 0.94rc.
> VPN... Hmm... Do you see any packet fragmentation/truncation?
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Peter Wolf <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Cc:
> Sent: Thursday, March 22, 2012 6:01 PM
> Subject: Re: Confirming a Bug
>
> Hello again Lars and Lars,
>
> Here is some additional information that may help you track this down.
>
> I think this behavior has something to do with my VPN. My servers are on the Amazon Cloud and I normally run my client on my laptop via a VPN (Tunnelblick: OS X 10.7.3; Tunnelblick 3.2.3 (build 2891.2932)). This is where I see the buggy behavior I describe.
>
> However, when my Client is running on an EC2 machine, then I get different behavior. I can not prove that it is always correct, but in at least one case my current code does not work on my laptop, but gets the correct number of results on an EC2 machine. Note that my scans are also much faster on the EC2 machine.
>
> I will do more tests to see if I can localize it further.
>
> Hope this helps
> Thank you again
> Peter
>
>
> On 3/19/12 2:24 PM, Peter Wolf wrote:
>> Hello Lars and Lars,
>>
>> Thank you for your help and attention.
>>
>> I wrote a standalone test that exhibits the bug.
>>
>> http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java
>>
>> Here is the output. It shows how the number of results and key value pairs varies as caching is changed, and families are included. It shows the bug starting with 3 families and 5000 caching. It also shows a new bug, where the query always fails with an IOException with 4 families.
>>
>> CacheSize FamilyCount ResultCount KeyValueCount
>> 1000      1           10000       10
>> 5000      1           10000       10
>> 10000     1           10000       10
>> 1000      2           10000       20
>> 5000      2           10000       20
>> 10000     2           10000       20
>> 1000      3           10000       30
>> 5000      3           5000        30
>> 10000     3           0           -1
>> Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server domu-12-31-39-05-6d-02.compute-1.internal:60020 for region bug,,1332174647830.ef906b7bd8eea8482c84edd906df24fd., row '\x00\x00\x00{\x00\x00\x00\x00\x00\x00\x00\x00', but failed after 10 attempts.
>> Exceptions:
>> java.io.IOException: java.io.IOException: Call to ... failed on local exception: java.io.IOException: Unexpected exception receiving call responses
>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1231)
>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1170)
>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1275)
>>     ... 7 more
>>
>>
>> Here is the main(). Note that createTable() and createData() are commented out. Uncomment these to populate the test table.
>>
>> public static void main(String[] args) {
>>
>>     try {
>>         //createTable("bug");
>>
>>         HBaseScanCacheBug bug = new HBaseScanCacheBug("bug");
>>         int id = 123;
>>
>>         //bug.createData(id);
>>
>>         System.out.println("CacheSize FamilyCount ResultCount KeyValueCount");
>>         for (int familyCount = 1; familyCount < 5; familyCount++) {
>>             bug.scan(id, 1000, familyCount);
>>             bug.scan(id, 5000, familyCount);
>>             bug.scan(id, 10000, familyCount);
>>         }
>>
>>     } catch (IOException e) {
>>         throw new Error(e);
>>     }
>>
>> }
>>
>> private static Configuration getConfiguration() {
>>     Configuration conf = HBaseConfiguration.create();
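
The three-family rows of the quoted table fit the earlier observation that the result difference is a multiple of the setCaching() value. A small sketch of that arithmetic follows. It assumes that 10000 results are expected, since that is the count every one- and two-family run returned; `ShortfallSketch` and `missingBatches` are hypothetical names and are not part of the test program.

```java
// Arithmetic check only; no HBase classes are used.
class ShortfallSketch {

    /**
     * Number of whole caching-sized batches missing from a scan,
     * or -1 if the shortfall is not a whole number of batches.
     */
    static int missingBatches(int expected, int observed, int caching) {
        int shortfall = expected - observed;
        if (shortfall < 0 || caching <= 0 || shortfall % caching != 0) {
            return -1;
        }
        return shortfall / caching;
    }

    public static void main(String[] args) {
        int expected = 10000; // assumption: the count seen with 1 and 2 families

        // {caching, observed ResultCount} for the three-family runs in the table
        int[][] runs = {{1000, 10000}, {5000, 5000}, {10000, 0}};

        for (int[] run : runs) {
            System.out.println("caching=" + run[0]
                    + " observed=" + run[1]
                    + " missingBatches=" + missingBatches(expected, run[1], run[0]));
        }
    }
}
```

If the shortfall always comes out as a whole number of batches, that would point at entire scanner fetches going missing rather than individual rows, which may be worth checking alongside the fragmentation question.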