HBase, mail # dev - HBase-0.94.2-SNAPSHOT Scanning Bug


Re: HBase-0.94.2-SNAPSHOT Scanning Bug
Jean-Daniel Cryans 2012-08-24, 01:43
We use thrift's scanOpenWithPrefix which does this:

        Filter f = new WhileMatchFilter(
            new PrefixFilter(getBytes(startAndPrefix)));

So that might just be it.

J-D
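
To make the filter combination above concrete, here is a minimal, self-contained sketch of how a prefix filter wrapped in a while-match filter behaves during a scan. It is a toy model, not HBase code: `ToyRowFilter`, `ToyPrefixFilter`, `ToyWhileMatchFilter`, and `PrefixScanSketch` are hypothetical names, and the real `filterRowKey` in 0.94 takes a buffer, offset, and length rather than a bare array. The sketch also assumes, as the variable name `startAndPrefix` suggests, that the scan starts at the prefix itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these types mimic the shape of row-key filtering,
// they are NOT the real org.apache.hadoop.hbase.filter classes.
interface ToyRowFilter {
    // Returns true when the row should be skipped.
    boolean filterRowKey(byte[] rowKey);

    // Returns true once the scan may stop because nothing later can match.
    boolean filterAllRemaining();
}

final class ToyPrefixFilter implements ToyRowFilter {
    private final byte[] prefix;

    ToyPrefixFilter(byte[] prefix) {
        this.prefix = prefix;
    }

    @Override
    public boolean filterRowKey(byte[] rowKey) {
        if (rowKey.length < prefix.length) {
            return true; // too short to carry the prefix: skip
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return true; // prefix mismatch: skip
            }
        }
        return false; // prefix matches: keep the row
    }

    @Override
    public boolean filterAllRemaining() {
        return false;
    }
}

final class ToyWhileMatchFilter implements ToyRowFilter {
    private final ToyRowFilter inner;
    private boolean done = false;

    ToyWhileMatchFilter(ToyRowFilter inner) {
        this.inner = inner;
    }

    @Override
    public boolean filterRowKey(byte[] rowKey) {
        boolean skip = inner.filterRowKey(rowKey);
        if (skip) {
            done = true; // first rejected row ends the whole scan
        }
        return skip;
    }

    @Override
    public boolean filterAllRemaining() {
        return done || inner.filterAllRemaining();
    }
}

final class PrefixScanSketch {
    // sortedRows must be in ascending order, as row keys are in a table.
    static List<String> scan(List<String> sortedRows, String startAndPrefix) {
        ToyRowFilter f = new ToyWhileMatchFilter(
            new ToyPrefixFilter(startAndPrefix.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        for (String row : sortedRows) {
            if (row.compareTo(startAndPrefix) < 0) {
                continue; // before the start row
            }
            if (f.filterAllRemaining()) {
                break; // the while-match wrapper has tripped
            }
            if (f.filterRowKey(row.getBytes(StandardCharsets.UTF_8))) {
                continue; // row rejected by its key alone
            }
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scan(Arrays.asList("a1", "b1", "b2", "c1", "c2"), "b"));
    }
}
```

The point of the model is that the decision is made on the row key alone, which is the code path Lars asks about in the quoted message below.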

On Thu, Aug 23, 2012 at 6:34 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> Do you use a row filter that implements filterRowKey in your production cluster?
> Looks like it only happens then, which is ironic, because that is the case I was trying to optimize (when the filter decides the row is filtered, it should immediately seek to the next row rather than iterating through all versions of all remaining columns of the current row).
>
> Oh well.
>
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Jean-Daniel Cryans <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Cc:
> Sent: Thursday, August 23, 2012 6:28 PM
> Subject: Re: HBase-0.94.2-SNAPSHOT Scanning Bug
>
> We tried reproducing on a local node but it doesn't show up. It did
> show as soon as we put it on our dev cluster.
>
> J-D
>
> On Thu, Aug 23, 2012 at 6:10 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>> It is interesting, though, because I have been running my local perf testing with this change included and have not seen this issue.
>>
>> -- Lars
>>
>>
>>
>> ----- Original Message -----
>> From: lars hofhansl <[EMAIL PROTECTED]>
>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>> Cc:
>> Sent: Thursday, August 23, 2012 6:05 PM
>> Subject: Re: HBase-0.94.2-SNAPSHOT Scanning Bug
>>
>> This:
>>
>> "IPC Server handler 43 on 10304" daemon prio=10 tid=0x00007f16b8b1f000 nid=0x6414 runnable [0x00007f16b47c6000]
>>    java.lang.Thread.State: RUNNABLE
>>     at org.apache.hadoop.hbase.KeyValue.createFirstOnRowColTS(KeyValue.java:1893)
>>     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:310)
>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:297)
>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:256)
>>     at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:522)
>>     - locked <0x00000006cec5bd88> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
>>     at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
>>     at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.requestSeek(NonLazyKeyValueScanner.java:38)
>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:297)
>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:256)
>>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRow(HRegion.java:3508)
>>
>> points to my change: https://issues.apache.org/jira/browse/HBASE-6577
>>
>> The trace is interesting: RegionScannerImpl.nextRow now seeks to the last KV in the row and then iterates as before.
>> However, then the reseek internally seeks to the first KV of the column, and somehow this interaction makes no progress forward.
>>
>> I'll revert that change.
>>
>> -- Lars
>>
>>
>> ________________________________
>> From: Elliott Clark <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]
>> Sent: Thursday, August 23, 2012 5:39 PM
>> Subject: HBase-0.94.2-SNAPSHOT Scanning Bug
>>
>> I recently tried to update one of our clusters to a version of 0.94.2
>> seen here: https://github.com/stumbleupon/hbase/commits/su_prod_94
>>
>> When doing that, all of the nodes started taking all available CPU
>> time.  There was not much of interest in the logs; however, the jstacks
>> looked like this: http://pastebin.com/raw.php?i=fw6P5RKE  Everything is
>> spinning in scans.  A version of 0.94.1 works perfectly, and reverting
>> solved all issues.  I don't really have enough data to point at any
>> JIRA as the cause; I was just wondering if anyone had some insight into
>> the few commits between the 0.94.1 release and the head of the above
>> GitHub branch that could cause scans to spin.
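
The row-skipping optimization Lars describes in this thread (seek past the current row instead of stepping over every remaining cell) can be illustrated with a small stand-alone sketch. This is a toy model under stated assumptions, not the HBASE-6577 patch: cells are plain `"row/column/version"` strings in a `TreeSet`, row names contain no `/`, and `RowSkipSketch` and its methods are hypothetical names. It only shows that both strategies should land on the same cell, which is the forward-progress property that failed in the reported case.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model: a sorted set of "row/column/version" keys stands in for a store.
final class RowSkipSketch {
    static String rowOf(String key) {
        return key.substring(0, key.indexOf('/'));
    }

    // Baseline: step over every remaining cell of the current row.
    static String skipRowByIterating(NavigableSet<String> cells, String current) {
        String row = rowOf(current);
        String next = cells.higher(current);
        while (next != null && rowOf(next).equals(row)) {
            next = cells.higher(next);
        }
        return next; // first cell of the next row, or null at the end
    }

    // Optimization: build a key that sorts after every cell of the row
    // and jump straight past it with a single lookup.
    static String skipRowBySeeking(NavigableSet<String> cells, String current) {
        String lastPossibleInRow = rowOf(current) + "/\uffff";
        return cells.higher(lastPossibleInRow);
    }

    public static void main(String[] args) {
        NavigableSet<String> cells = new TreeSet<>(
            Arrays.asList("r1/a/1", "r1/a/2", "r1/b/1", "r2/a/1"));
        System.out.println(skipRowByIterating(cells, "r1/a/1"));
        System.out.println(skipRowBySeeking(cells, "r1/a/1"));
    }
}
```

A check of this kind, comparing the seek result against the plain iteration result, is one cheap way to catch a seek that fails to move forward before it reaches a cluster.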