HBase >> mail # dev >> MR job "randomly" scans up thousands of rows less than the it should.

Re: MR job "randomly" scans up thousands of rows less than the it should.
Following up on this.

Back porting HBASE-4485 didn't seem to help.
We were a bit under pressure and I didn't have time to investigate deeper
(there's a small chance I missed something during the backport).

We eventually upgraded to 0.92, which fixed the problem :)

Thanks a lot for helping with this,

On 2/15/12 1:33 PM, "Cosmin Lehene" <[EMAIL PROTECTED]> wrote:

>Amit, HBASE-4485 describes the behavior I'm seeing, thanks.
>Looking over the patches, I'm under the impression that HBASE-4485, which
>is a subtask of HBASE-2856, was back ported through HBASE-4838 to 0.92 by
>Am I wrong?
>On 2/14/12 11:06 PM, "Amitanand Aiyer" <[EMAIL PROTECTED]> wrote:
>>Hi Cosmin,
>>  https://issues.apache.org/jira/browse/HBASE-4485 might be applicable.
>>  The patch was included in the fix for 2856.
>>From: Cosmin Lehene [[EMAIL PROTECTED]]
>>Sent: Tuesday, February 14, 2012 12:02 PM
>>Subject: Re: MR job "randomly" scans up thousands of rows less than the
>>it should.
>>I just got back on this issue. Initially the behavior we've seen (missing
>>rows) wouldn't reproduce on 0.90 using TestAcidGuarantees.
>>However, if the puts in the writer threads include additional rows, the
>>scanners will start reading fewer rows. This reproduces consistently on
>>0.90 and seems to be working correctly on 0.92.
>>HBASE-2856/HBASE-4838 are probably the solution, although there's a chance
>>it's some other fix on 0.92 (ideas?)
>>We're undecided whether backporting to 0.90 or upgrading the affected
>>clusters to 0.92 would be better.
>>Also, is there interest in this fix on 0.90?
>>On 2/6/12 6:25 PM, "Cosmin Lehene" <[EMAIL PROTECTED]> wrote:
>>>Thanks Ted!
>>>I wonder if it would make more sense to port it to 0.90.X or upgrade to
>>>On 2/2/12 5:03 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
>>>>HBASE-4838 ports HBASE-2856 to 0.92
>>>>On Thu, Feb 2, 2012 at 4:46 PM, Cosmin Lehene <[EMAIL PROTECTED]>
>>>>> (sorry for the damaged subject :))
>>>>> Hey Jon,
>>>>> We have two column families.
>>>>> There are no filters and there's a full table scan. We're not
>>>>> rows.
>>>>> I did see however a single time that we had one qualifier "fault" in
>>>>> job counters (it was missing, and it wasn't supposed to be missing).
>>>>> However that was only once and it doesn't happen when we encounter
>>>>> rows.
>>>>> We're getting this behavior consistently, although I couldn't figure a way
>>>>> to reproduce it. I'll try running multiple instances of the job in
>>>>> parallel to figure out if that would affect the outcome.
>>>>> I'll probably have to add more debugging for the affected rows and dig
>>>>> deeper.
>>>>> HBASE-2856 is a pretty large issue - do you think it could be related to
>>>>> what I'm seeing? If so it could help me reproduce it.
>>>>> Thanks,
>>>>> Cosmin
>>>>> On 2/1/12 11:30 PM, "Jonathan Hsieh" <[EMAIL PROTECTED]> wrote:
>>>>> >Cosmin,
>>>>> >
>>>>> >How many column families do you have in this table? Are you using
>>>>> >filters in your HBase scans? Are you using skip rows that may not have
>>>>> >qualifiers present?
>>>>> >
>>>>> >There are a few known issues with multi-CF atomicity and a recent
>>>>> >about
>>>>> >flushes that may be related to this problem.  There's HBASE-2856, a
>>>>> >having to do with flushes which is pretty intricate and only in
>>>>> >
>>>>> >Jon.
>>>>> >
>>>>> >On Wed, Feb 1, 2012 at 8:46 PM, Cosmin Lehene <[EMAIL PROTECTED]>
>>>>> >
>>>>> >> We have a MR job that runs every few minutes on some time series
>>>>> >> which is continuously updated (never deleted).
>>>>> >> Every few (in the range of tens to hundreds) runs the map task