-
counters and scanners inconsistency
Young 2012-01-14, 00:59
I'm having an odd problem when incrementing counters while a scan is running at the same time (the two run in separate processes).
For low-rate counters (< 1 increment per second) there is no problem, but for higher-rate counters (> 10 increments per second) the counter values read back are inconsistent.
Averaging the values over time gives the correct count (i.e. the counter itself is still increasing correctly), but at certain samples the counter drops to some seemingly random number. This number stays the same for about a day and a half, then jumps to a different random number for the next day and a half. The cycle coincides exactly with compaction of the table in question.
Again, whenever the counter value is not equal to the random number of the day, it is correct. I'm wondering if there is something going on underneath that would cause:
1) the incorrect but consistent number when incrementing and scanning simultaneously
2) the random number reset and its relationship with compaction of the table
Keep in mind that most of the hbase settings are at default.
Thanks!
p.s. I ran a smaller experiment using the hbase shell and found the counters to be consistent even at high increment rates. I am wondering whether there is a buffering issue with the HTable scanner object: if it is unable to obtain a lock on the row, does it fall back to the data on disk?
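The symptom described in this post (a counter that is only ever incremented, yet sometimes reads lower than an earlier sample) can be flagged mechanically in the scanned values. The following is a minimal sketch, not code from the thread. It assumes increments are always positive, so any reading below the running maximum is suspect. The function name `find_drops` and the sample readings are made up for illustration.

```python
from typing import List, Tuple


def find_drops(samples: List[int]) -> List[Tuple[int, int, int]]:
    """Return (index, value, running_max) for each sample below the running max.

    Assumes the counter is only ever incremented by positive amounts,
    so a correct sequence of readings never decreases.
    """
    drops = []
    running_max = None
    for index, value in enumerate(samples):
        if running_max is not None and value < running_max:
            # Reading went backwards relative to what was already observed.
            drops.append((index, value, running_max))
        else:
            running_max = value
    return drops


if __name__ == "__main__":
    # Hypothetical readings: 37 stands in for the "random number of the day".
    readings = [100, 112, 125, 37, 150, 37, 171]
    for index, value, expected_at_least in find_drops(readings):
        print(f"sample {index}: read {value}, expected >= {expected_at_least}")
```

Logging the timestamp of each flagged sample next to the compaction times would make it easier to confirm or rule out the correlation with compaction.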
-
Re: counters and scanners inconsistency
Todd Lipcon 2012-01-17, 02:21
Hi Young,
This is interesting and unexpected behavior. What version are you running?
If you can write a unit test (or system test) that demonstrates the problem against a running cluster, that would be excellent.
-Todd
-- Todd Lipcon Software Engineer, Cloudera
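A test of the kind requested above could follow the shape sketched here: one thread increments a counter while another reads it, and the readings are then checked for backward jumps. This is only a skeleton under stated assumptions. `InMemoryCounter` is a hypothetical in-process stand-in, not an HBase API. To reproduce the reported problem, its `increment` and `read` methods would have to be replaced with real client calls against a running cluster.

```python
import threading


class InMemoryCounter:
    """Hypothetical stand-in for a counter cell; replace with real client calls."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, amount=1):
        with self._lock:
            self._value += amount
            return self._value

    def read(self):
        with self._lock:
            return self._value


def run_check(counter, increments=10_000):
    """Increment in one thread while reading in another.

    Returns (readings, final_value).
    """
    readings = []
    done = threading.Event()

    def writer():
        for _ in range(increments):
            counter.increment()
        done.set()

    def reader():
        while not done.is_set():
            readings.append(counter.read())
        # One last read after the writer has finished.
        readings.append(counter.read())

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return readings, counter.read()


def is_non_decreasing(values):
    """True if no value is smaller than the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    readings, final_value = run_check(InMemoryCounter())
    print("final value:", final_value)
    print("readings never decreased:", is_non_decreasing(readings))
```

With the in-memory stand-in the check always passes. It only becomes informative once the two methods talk to the real table, ideally with the reader implemented as a scan so that it exercises the same code path as in the report.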
-
Re: counters and scanners inconsistency
Young 2012-01-18, 18:52
Hello Todd,
Thanks for pointing this out to me. The client was running 0.90.1, while the cluster was running 0.90.3. I upgraded both to the latest CDH3 distro version, 0.90.4, and the problem seems to have gone away (a simultaneous scanner + increment now produces consistent results). I still don't know what the root cause was, but this simple upgrade was enough to fix it.
Thanks!