HBase >> mail # dev >> MR job "randomly" scans up thousands of rows less than the it should.


Re: MR job "randomly" scans up thousands of rows less than the it should.
Thanks Ted!

I wonder if it would make more sense to port it to 0.90.X or upgrade to
0.92.

Cosmin

On 2/2/12 5:03 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:

>HBASE-4838 ports HBASE-2856 to 0.92
>
>FYI
>
>On Thu, Feb 2, 2012 at 4:46 PM, Cosmin Lehene <[EMAIL PROTECTED]> wrote:
>
>> (sorry for the damaged subject :))
>>
>>
>> Hey Jon,
>> We have two column families.
>> There are no filters and there's a full table scan. We're not skipping
>> rows.
>> I did see however a single time that we had one qualifier "fault" in the
>> job counters (it was missing, and it wasn't supposed to be missing).
>> However that was only once and it doesn't happen when we encounter
>> missing rows.
>>
>> We're getting this behavior consistently, although I couldn't figure out
>> a way to reproduce it. I'll try running multiple instances of the job in
>> parallel to figure out if that would affect the outcome.
>> I'll probably have to add more debugging for the affected rows and dig
>> deeper.
>>
>> HBASE-2856 is a pretty large issue - do you think it could be related to
>> what I'm seeing? If so it could help me reproduce it.
>>
>> Thanks,
>> Cosmin
>>
>>
>>
>>
>> On 2/1/12 11:30 PM, "Jonathan Hsieh" <[EMAIL PROTECTED]> wrote:
>>
>> >Cosmin,
>> >
>> >How many column families do you have in this table?   Are you using any
>> >filters in your HBase scans?  Are you skipping rows that may not have
>> >qualifiers present?
>> >
>> >There are a few known issues with multi-CF atomicity and a recent one
>> >about flushes that may be related to this problem.  There's HBASE-2856,
>> >a fix having to do with flushes which is pretty intricate and only in
>> >0.92.
>> >
>> >Jon.
>> >
>> >On Wed, Feb 1, 2012 at 8:46 PM, Cosmin Lehene <[EMAIL PROTECTED]> wrote:
>> >
>> >> We have a MR job that runs every few minutes on some time series data
>> >> which is continuously updated (never deleted).
>> >> Every few (in the range of tens to hundreds) runs the map task that
>> >> covers the last region will get fewer input records (off by 500-5000
>> >> rows) without any splits happening. This lower number of input records
>> >> could persist for a few MR runs, but will eventually get back to the
>> >> "correct" value.
>> >>
>> >> This drop can be seen in the "map input records" metric, and it's also
>> >> correlated with the metrics that get computed by the MR job (so it's
>> >> not an MR counter bug).
>> >>
>> >> There are no exceptions in the MR job, or in the region server and
>> >> this doesn't seem to be correlated with any compaction, split or
>> >> region movement.
>> >> The only "variable" in this scenario is that new data gets injected
>> >> continuously (and the actual MR job, which is idempotent).
>> >>
>> >> This entire puzzle takes place on HBase 0.90.5-ish (12 Dec 2011) on
>> >> top of Hadoop cdh3u2.
>> >>
>> >> Cosmin
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >--
>> >// Jonathan Hsieh (shay)
>> >// Software Engineer, Cloudera
>> >// [EMAIL PROTECTED]
>>
>>
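
One way to approach the extra debugging for the affected rows mentioned in the thread is to record the row keys each run sees and diff consecutive runs. Below is a minimal sketch in Python, assuming the keys from two runs over the last region have already been dumped to lists. The function name `missing_rows` and the sample keys are hypothetical, and no HBase API is involved.

```python
from typing import Iterable, List


def missing_rows(previous_run: Iterable[str], current_run: Iterable[str]) -> List[str]:
    """Return row keys seen in the previous run but absent from the current one.

    The thread states that data is only added, never deleted, so any key
    returned here is a row the later scan should have seen but did not.
    """
    current = set(current_run)
    return sorted(key for key in previous_run if key not in current)


# Hypothetical row keys collected by two consecutive runs.
run_a = ["row-001", "row-002", "row-003", "row-004"]
run_b = ["row-001", "row-004", "row-005"]

print(missing_rows(run_a, run_b))  # prints ['row-002', 'row-003']
```

Knowing which keys drop out (for example, whether they are contiguous or all recently written) may help narrow down whether a flush-related issue such as HBASE-2856 is involved.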
