HBase >> mail # user >> Slow scanning for PrefixFilter on EncodedBlocks


lars hofhansl 2012-10-16, 21:39
Re: Slow scanning for PrefixFilter on EncodedBlocks
I reopened HBASE-6577

----- Original Message -----
From: lars hofhansl <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>
Cc:
Sent: Tuesday, October 16, 2012 2:39 PM
Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks

Looks like this is exactly the scenario I was trying to optimize with HBASE-6577. Hmm...
________________________________
From: lars hofhansl <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Tuesday, October 16, 2012 12:21 AM
Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks

PrefixFilter does not do any seeking by itself, so I doubt this is related to HBASE-6757.
Does this only happen with FAST_DIFF encoding?
If you can create an isolated test program (that sets up the scenario and then runs a scan with the filter such that it is very slow), I'm happy to take a look.

-- Lars

----- Original Message -----
From: J Mohamed Zahoor <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Cc:
Sent: Monday, October 15, 2012 10:27 AM
Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks

Is this related to HBASE-6757?
I use a filter list with
  - prefix filter
  - filter list of column filters

/zahoor

On Monday, October 15, 2012, J Mohamed Zahoor wrote:

> Hi
>
> My scanner performance is very slow when using a PrefixFilter on an
> **Encoded Column** (encoded using FAST_DIFF both in memory and on disk).
> I am using HBase 0.94.1.
>
> jstack shows that much of the time is spent seeking to the row.
> Even if I give an exact row key match in the prefix filter, it takes about
> two minutes to return a single row.
> Running this multiple times still seems to hit the disk
> (loadBlock).
>
>
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:1027)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:461)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
> at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
> - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
> - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRow(HRegion.java:3507)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3455)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3406)
> - locked <0x000000059589bb30> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3423)
>
> If I set the start and end row to the same row in the scan, it comes back
> very quickly.
>
> Saw this link
> http://search-hadoop.com/m/9f0JH1Kz24U1&subj=Re+HBase+0+94+2+SNAPSHOT+Scanning+Bug
> But it looks like things are fine in 0.94.1.
>
> Any pointers on why this is slow?
>
>
> Note: the row has few columns (5, and less than a KB) and lots of
> versions (1500+).
>
> ./zahoor
>
>
>
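
The original report mentions that the scan returns quickly once start and end rows are set on the scan. A minimal sketch of how such bounds could be derived from a row prefix, so the scan is limited to the prefix range instead of relying on PrefixFilter alone. The class and method names (PrefixScanBounds, stopRowForPrefix) are hypothetical, and the sketch assumes row keys are compared as unsigned bytes in lexicographic order; it is plain Java with no HBase dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixScanBounds {

    /**
     * Returns the smallest row key that sorts after every key starting with
     * the given prefix. Returns an empty array when no such key exists (the
     * prefix is empty or consists only of 0xFF bytes); an empty stop row is
     * assumed here to mean "no upper bound".
     */
    public static byte[] stopRowForPrefix(byte[] prefix) {
        // Skip trailing 0xFF bytes, which cannot be incremented.
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return new byte[0];
        }
        // Keep bytes [0, i] and increment the last kept byte.
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] prefix = "row-42".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(prefix);
        // With the HBase client, these would be passed to the scan, e.g.
        // scan.setStartRow(prefix) and scan.setStopRow(stop) (not run here).
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints "row-43"
        System.out.println(Arrays.toString(stopRowForPrefix(new byte[] {0x01, (byte) 0xFF}))); // prints "[2]"
    }
}
```

The start row is the prefix itself and the stop row is exclusive, so the range covers exactly the keys that begin with the prefix. Whether this avoids the slow seeks seen in the stack trace above would still need to be confirmed against the affected table.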