HBase >> mail # user >> Custom Filter and SEEK_NEXT_USING_HINT issue


Eugeny Morozov 2013-01-18, 23:28
Ted Yu 2013-01-18, 23:56
Eugeny Morozov 2013-01-19, 09:36
Ted 2013-01-19, 13:16
Eugeny Morozov 2013-01-20, 21:22
Michael Segel 2013-01-21, 00:22
Eugeny Morozov 2013-01-21, 08:16
ramkrishna vasudevan 2013-01-21, 08:56
Anoop Sam John 2013-01-21, 08:59
Re: Custom Filter and SEEK_NEXT_USING_HINT issue
Anoop, Ramkrishna

Thank you for the explanation! I've got it.
On Mon, Jan 21, 2013 at 12:59 PM, Anoop Sam John <[EMAIL PROTECTED]> wrote:

> > I suppose if the scanning process had started at once on
> > all regions, then I would find in the log files at least one value per
> > region, but I have found one value per region only for those regions that
> > reside before that particular one.
>
> @Eugeny - FuzzyFilter, like any other filter, works on the server side. A
> scan from the client side is sequential, starting from the first region
> (the region with the empty start key, or the region that contains the
> start key you specified in your scan). The client sends a request to the
> RS to scan a region. Once that region is finished, the client contacts the
> next region, and so on. There is no parallel scanning of multiple regions
> from the client side. [This is when using the HTable scan APIs.]
>
> When MR is used for scanning, we do parallel scans over all the regions,
> because there is a mapper per region. But a normal scan from the client
> side is sequential across the regions, not parallel.
>
> -Anoop-
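
The sequential behaviour described above can be sketched with a toy model in plain Java. This is only an illustration, not HBase client code: the class name, the method name and the sample row keys are hypothetical, and each "region" is just a sorted list of strings. It shows why only the regions that come before the target row ever produce log output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SequentialRegionScanDemo {
    // Toy model: regions are ordered by start key and visited one at a time.
    // Returns the indexes of the regions contacted until the target row is found.
    static List<Integer> regionsVisited(List<List<String>> regions, String target) {
        List<Integer> visited = new ArrayList<>();
        for (int i = 0; i < regions.size(); i++) {
            visited.add(i); // the client contacts exactly one region at a time
            if (regions.get(i).contains(target)) {
                break; // the caller stops scanning once the row is found
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("a1", "a2"),
                Arrays.asList("f1", "f7"),
                Arrays.asList("m3", "m9"),
                Arrays.asList("x4"));
        // Regions after the one holding "f7" are never contacted.
        System.out.println(regionsVisited(regions, "f7")); // prints [0, 1]
    }
}
```

A MapReduce job would correspond to running the inner check on every region at once, which is why it would log something for each region.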
> ________________________________________
> From: Eugeny Morozov [[EMAIL PROTECTED]]
> Sent: Monday, January 21, 2013 1:46 PM
> To: [EMAIL PROTECTED]
> Cc: Alex Baranau
> Subject: Re: Custom Filter and SEEK_NEXT_USING_HINT issue
>
> Finally, the mystery has been solved.
>
> Small remark before I explain everything.
>
> The situation with only one region is absolutely the same:
> Fzzy: AAAA1Q7iQ9JA
> Next fzzy: F7dtxwqVQ_Pw  <-- the value I'm trying to find.
> Fzzy: F7dt8QWPSIDw
> Somehow FuzzyRowFilter has just omitted my value here.
>
>
> So, the explanation.
> In the javadoc for FuzzyRowFilter, a question mark is used as the
> placeholder for an unknown value. Of course it's possible to use anything,
> including zero, instead of the question mark.
> For quite some time we used literals to encode our keys. Literals like the
> ones you've seen already: AAAA1Q7iQ9JA or F7dt8QWPSIDw. But that's the
> Base64 form of just 8 bytes, which requires 1.5 times as much space. So
> we've decided to store the raw version - just byte[8]. But unfortunately
> the symbol '?' (0x3F) is exactly in the middle of the ASCII range
> (according to the ASCII table, http://www.asciitable.com/), which means
> that with FuzzyRowFilter we skip half of the values in some cases. At the
> same time, the question mark comes exactly before any letter that could be
> used in a key.
>
> Although we have integration tests, it is just a coincidence that we had
> no such example in them.
>
> So, as advice: always use zero instead of a question mark for
> FuzzyRowFilter.
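
To illustrate the advice above, here is a minimal sketch in plain Java with no HBase dependency. It assumes, as the explanation suggests, that a seek hint can carry the placeholder byte at the non-fixed positions. The class, the helper names and the sample bytes are hypothetical, and this is not the FuzzyRowFilter implementation. Row keys are compared as unsigned bytes, which is how HBase orders them.

```java
import java.util.Arrays;

class FuzzyPlaceholderDemo {
    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Hypothetical hint: the fixed prefix followed by placeholder bytes
    // at the non-fixed (wildcard) positions.
    static byte[] hint(byte[] fixedPrefix, int wildcardCount, byte placeholder) {
        byte[] h = Arrays.copyOf(fixedPrefix, fixedPrefix.length + wildcardCount);
        Arrays.fill(h, fixedPrefix.length, h.length, placeholder);
        return h;
    }

    // A row that sorts before the hint is never reached after seeking to the hint.
    static boolean sortsBefore(byte[] row, byte[] hint) {
        return compareUnsigned(row, hint) < 0;
    }

    public static void main(String[] args) {
        byte[] prefix = {0x17, 0x67};
        byte[] row = {0x17, 0x67, 0x10, 0x22}; // raw wildcard bytes below '?' (0x3F)

        System.out.println(sortsBefore(row, hint(prefix, 2, (byte) '?'))); // prints true
        System.out.println(sortsBefore(row, hint(prefix, 2, (byte) 0)));   // prints false
    }
}
```

With Base64 literals the problem is hidden because every letter sorts after '?'; with raw bytes, any value below 0x3F at a wildcard position falls before such a hint.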
>
> Thanks to everyone!
>
> P.S. But the question about region scanning order is still open. I do not
> understand why with FuzzyFilter it goes from one region to another until it
> stops at the value. I suppose if the scanning process had started at once on
> all regions, then I would find in the log files at least one value per
> region, but I have found one value per region only for those regions that
> reside before that particular one.
>
>
> On Mon, Jan 21, 2013 at 4:22 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
> > If it's the same class and it's not a patch, then the first class loaded
> > wins.
> >
> > So if you have a Class Foo and HBase has a Class Foo, your code will
> > never see the light of day.
> >
> > Perhaps I'm stating the obvious, but it's something to think about when
> > working with Hadoop.
> >
> > On Jan 19, 2013, at 3:36 AM, Eugeny Morozov <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Ted,
> > >
> > > that is correct.
> > > HBase 0.92.x and we use part of the patch 6509.
> > >
> > > I use the filter as a custom filter; it lives in a separate jar file
> > > and goes on HBase's classpath. I did not patch HBase.
> > > Moreover, I do not use the protobuf descriptions that come with the
> > > filter in the patch. I have only two classes: FuzzyRowFilter itself and
> > > its test class.
> > >
> >
Evgeny Morozov
Developer Grid Dynamics
Skype: morozov.evgeny
www.griddynamics.com
[EMAIL PROTECTED]