HBase >> mail # user >> Custom Filter and SEEK_NEXT_USING_HINT issue


Custom Filter and SEEK_NEXT_USING_HINT issue
Hi, folks!

HBase, Hadoop, etc. are at version CDH-4.1.2.

I'm using a custom FuzzyRowFilter, which I took from
http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/
and, after it had been in use for quite some time, we found that it had
started losing data.

Basically, the idea of FuzzyRowFilter is that it looks for the key that has
been provided, and if the current key does not match but further candidates
may exist in the table, it returns SEEK_NEXT_USING_HINT. getNextKeyHint(...)
then builds the required key. As I understand it, HBase should fast-forward
to that key, which should be similar or identical to running a Scan with
setStartRow.
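
To make my expectation concrete, here is a minimal sketch of the seek-hint
contract as I understand it. It is a plain-Java simulation, not HBase code:
the TreeSet stands in for the sorted table, and SeekHintSimulation,
TargetFilter, Verdict and the STOP verdict are hypothetical names invented
for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class SeekHintSimulation {
    // Hypothetical verdicts; STOP is a simplification for "nothing more to find".
    enum Verdict { INCLUDE, SEEK_NEXT_USING_HINT, STOP }

    // Hypothetical stand-in for a filter that wants exactly one target key.
    static final class TargetFilter {
        final String target;
        int calls = 0; // how many rows the filter was asked about

        TargetFilter(String target) { this.target = target; }

        Verdict filterRow(String rowKey) {
            calls++;
            int cmp = rowKey.compareTo(target);
            if (cmp == 0) return Verdict.INCLUDE;
            if (cmp < 0) return Verdict.SEEK_NEXT_USING_HINT; // target is further ahead
            return Verdict.STOP;                              // already past the target
        }

        String nextKeyHint() { return target; }
    }

    // Simulated scanner: on a hint it jumps to the first key >= hint
    // instead of walking the rows in between.
    static List<String> scan(TreeSet<String> table, TargetFilter filter) {
        List<String> out = new ArrayList<>();
        String row = table.isEmpty() ? null : table.first();
        while (row != null) {
            Verdict v = filter.filterRow(row);
            if (v == Verdict.INCLUDE) {
                out.add(row);
                row = table.higher(row);
            } else if (v == Verdict.SEEK_NEXT_USING_HINT) {
                row = table.ceiling(filter.nextKeyHint());
            } else {
                break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> table = new TreeSet<>(Arrays.asList("a1", "b2", "c3", "d4", "e5"));
        TargetFilter filter = new TargetFilter("d4");
        System.out.println(scan(table, filter) + " after " + filter.calls + " filter calls");
    }
}
```

In this sketch the filter is consulted three times (a1, d4, e5) and b2 and c3
are never visited. That single jump is what I expected from a real scan as
well.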

I'm trying to find the key F7dt8QWPSIDw. It is definitely in HBase: I'm able
to get it using Scan.setStartRow.
For the FuzzyRowFilter I'm using an empty Scan; I didn't specify a start row,
stop row, or anything related.
This is what happens:

Fzzy: AAAA1Q7iQ9JA
Next fzzy: F7dtxwqVQ_Pw
Fzzy: AQAAnA96rxTg
Next fzzy: F7dtxwqVQ_Pw
Fzzy: AgAADQWPSIDw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: AwAA-Q33Zb9Q
Next fzzy: F7dtxwqVQ_Pw
Fzzy: BAAAOg8oyu7A
Next fzzy: F7dtxwqVQ_Pw
Fzzy: BQAA9gqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: BgABZQ7iQ9JA
Next fzzy: F7dtxwqVQ_Pw
Fzzy: BwAAbgrpAojg
Next fzzy: F7dtxwqVQ_Pw
Fzzy: CAAAUQWPSIDw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: CQABVgqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: CgAAOQ7iQ9JA
Next fzzy: F7dtxwqVQ_Pw
Fzzy: CwAALwqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: DAAAMwWPSIDw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: DQAADgjqzsIQ
Next fzzy: F7dtxwqVQ_Pw
Fzzy: DgAAOgCcWv9g
Next fzzy: F7dtxwqVQ_Pw
Fzzy: DwAAKg7iQ9JA
Next fzzy: F7dtxwqVQ_Pw
Fzzy: EAAAugqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: EQAAJAqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: EgAABgIOMBgg
Next fzzy: F7dtxwqVQ_Pw
Fzzy: EwAAEwqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: FAAACQqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: FQAAIAqVQrTw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: FgAAeAWPSIDw
Next fzzy: F7dtxwqVQ_Pw
Fzzy: FwAAAw33Zb9Q
Next fzzy: F7dtxwqVQ_Pw
Fzzy: F7dt8QWPSIDw

It's obvious that my FuzzyRowFilter knows what to search for, and every time
it asks for the same key again.
The very first key, I suppose, is just the first key of the region where my
key is located.
The very last key is already bigger than what I'm trying to find, which is
why the FuzzyRowFilter stopped there.
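
A small check of that ordering, as a sketch only. It assumes the keys in the
log are URL-safe Base64 renderings of the binary row keys (an assumption
based on the '-' and '_' characters in them) and compares them the way HBase
orders row keys, as unsigned bytes. KeyOrderCheck and its methods are names
made up for this example.

```java
import java.util.Base64;

public class KeyOrderCheck {
    // Assumption: logged keys are URL-safe Base64 of the raw row key bytes.
    static byte[] decode(String logged) {
        return Base64.getUrlDecoder().decode(logged);
    }

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        String hint = "F7dtxwqVQ_Pw"; // "Next fzzy" value from the log
        String last = "F7dt8QWPSIDw"; // last "Fzzy" value from the log
        System.out.println("as text:  " + Integer.signum(last.compareTo(hint)));
        System.out.println("as bytes: " + Integer.signum(compareUnsigned(decode(last), decode(hint))));
    }
}
```

Under that assumption the printed text does not sort the same way as the
decoded bytes: as text the last key sorts before the hint, while as bytes it
sorts after it, which matches the filter stopping at that key.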

Do you know of any issue with SEEK_NEXT_USING_HINT? I've searched, but
without success.
Do you have any idea how to explain so many attempts?

Thanks in advance.
--
Evgeny Morozov
Developer Grid Dynamics
Skype: morozov.evgeny
www.griddynamics.com
[EMAIL PROTECTED]