Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # user >> Questions on FuzzyRowFilter


Copy link to this message
-
Re: Questions on FuzzyRowFilter
Hi Mike,
I agree with you - the way you've outlined is exactly the way Phoenix has
implemented it. It's a bit of a problem with terminology, though. We call
it salting: http://phoenix.incubator.apache.org/salted.html. We hash the
key, mod the hash with the SALT_BUCKET value you provide, and prepend the
row key with this single byte value. Maybe you can coin a good term for
this technique?

FWIW, you don't lose the ability to do a range scan when you salt (or
hash-the-key and mod by the number of "buckets"), but you do need to run a
scan for each possible value of your salt byte (0 - SALT_BUCKET-1). Then
the client does a merge sort among these scans. It performs well.

Thanks,
James
On Fri, May 9, 2014 at 11:57 PM, Michael Segel <[EMAIL PROTECTED]>wrote: