lars hofhansl 2013-07-01, 10:58
Re: Issues with delete markers
So, yesterday, I implemented this change via a coprocessor that initiates a
raw scan, keeps track of the number of delete markers encountered, and stops
when a configured threshold is met. It instantiates its own ScanDeleteTracker
to do the masking through delete markers. In short: raw scan, count delete
markers, stop if too many are encountered, and mask deleted cells so that
sane results go back to the client.
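
A minimal sketch of the count-and-mask logic described above, written in plain Java rather than against the HBase coprocessor API. Everything here is a hypothetical stand-in: RawCell, BoundedDeleteScan and scanRow are illustrative names, the set of deleted qualifiers is a much-simplified substitute for ScanDeleteTracker, and timestamps and delete types are ignored (a column delete marker is assumed to sort before the puts it masks, as in the row layout in the original question).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for one cell of a raw scan: a qualifier plus a delete flag. */
final class RawCell {
    final String qualifier;
    final boolean isDeleteMarker;

    RawCell(String qualifier, boolean isDeleteMarker) {
        this.qualifier = qualifier;
        this.isDeleteMarker = isDeleteMarker;
    }
}

final class BoundedDeleteScan {
    /** Outcome of scanning one row. */
    static final class Result {
        final List<String> columns;   // live (unmasked) columns, in scan order
        final int deleteMarkersSeen;  // delete markers counted before stopping
        final boolean stoppedEarly;   // true if the marker threshold was reached

        Result(List<String> columns, int deleteMarkersSeen, boolean stoppedEarly) {
            this.columns = columns;
            this.deleteMarkersSeen = deleteMarkersSeen;
            this.stoppedEarly = stoppedEarly;
        }
    }

    /**
     * Walks the raw cells of one row in scan order, masks puts covered by an
     * earlier delete marker, and stops once maxColumns live columns are
     * collected or maxDeleteMarkers markers have been seen.
     */
    static Result scanRow(List<RawCell> rawCells, int maxColumns, int maxDeleteMarkers) {
        Set<String> deleted = new HashSet<>(); // simplified delete tracking (assumption)
        List<String> live = new ArrayList<>();
        int markers = 0;
        boolean stoppedEarly = false;

        for (RawCell cell : rawCells) {
            if (cell.isDeleteMarker) {
                markers++;
                if (markers >= maxDeleteMarkers) {
                    stoppedEarly = true; // threshold met: give up on this row
                    break;
                }
                deleted.add(cell.qualifier);
            } else if (!deleted.contains(cell.qualifier)) {
                live.add(cell.qualifier);
                if (live.size() >= maxColumns) {
                    break; // enough columns for the client
                }
            }
        }
        return new Result(live, markers, stoppedEarly);
    }
}
```

With the numbers from the original question, a call would look like BoundedDeleteScan.scanRow(cells, 300, 5000); the caller can check stoppedEarly to tell a truncated result from a complete one.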

So far it has been working reasonably well. Also, with HBASE-8809,
version tracking etc. should work with filters now.
On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> That would be quite a dramatic change; we cannot pass delete markers to the
> existing filters without confusing them.
> We could invent a new method (filterDeleteKV or filterDeleteMarker or
> something) on filters along with a new "filter type" that implements that
> method.
>
>
> -- Lars
>
>
> ----- Original Message -----
> From: Varun Sharma <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; [EMAIL PROTECTED]
> Cc:
> Sent: Sunday, June 30, 2013 1:56 PM
> Subject: Re: Issues with delete markers
>
> Sorry, typo. I meant: for user scans, should we be passing delete
> markers through the filters as well?
>
> Varun
>
>
> On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
>
> > For user scans, I feel we should be passing delete markers through as well.
> >
> >
> > On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> >
> >> I tried this a little bit and it seems that filters are not called on
> >> delete markers. For raw scans returning delete markers, does it make
> >> sense to do that?
> >>
> >> Varun
> >>
> >>
> >> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hi,
> >>>
> >>> We are having an issue with the way HBase handles deletes. We are
> >>> looking to retrieve 300 columns in a row, but the row has tens of
> >>> thousands of delete markers in it before we span the 300 columns,
> >>> something like this:
> >>>
> >>>
> >>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3
> >>>
> >>> And so on. The issue is that to retrieve these 300 columns, we need
> >>> to go through tens of thousands of deletes. Sometimes we get a spurt
> >>> of these queries, and that DDoSes a region server. We are okay with
> >>> saying: only return the first 300 columns, and stop once you
> >>> encounter, say, 5K column delete markers.
> >>>
> >>> I wonder if such a construct is provided by HBase, or do we need to
> >>> build something on top of the RAW scan and handle the delete masking
> >>> there?
> >>>
> >>> Thanks
> >>> Varun
> >>>
> >>>
> >>>
> >>
> >
>
>
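
Lars's suggestion in the quoted reply (a separate filterDeleteKV or filterDeleteMarker method on a new filter type) could be sketched roughly as follows. This is plain Java under stated assumptions, not HBase's Filter API: CellFilter, DeleteMarkerAwareFilter, MaxDeleteMarkersFilter and FilterDispatch are hypothetical names, raw cells are modeled as strings where a "D:" prefix denotes a delete marker, and delete masking is deliberately left out to keep the dispatch logic visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the existing filter contract: only ever sees ordinary cells. */
interface CellFilter {
    boolean includeCell(String qualifier);
}

/** Hypothetical new filter type that additionally receives delete markers. */
interface DeleteMarkerAwareFilter extends CellFilter {
    /** Returns false to ask the scanner to stop reading the current row. */
    boolean filterDeleteMarker(String qualifier);
}

/** Example of the new type: stop once maxMarkers delete markers have been seen. */
final class MaxDeleteMarkersFilter implements DeleteMarkerAwareFilter {
    private final int maxMarkers;
    private int seen = 0;

    MaxDeleteMarkersFilter(int maxMarkers) {
        this.maxMarkers = maxMarkers;
    }

    @Override
    public boolean includeCell(String qualifier) {
        return true; // this filter does not restrict ordinary cells
    }

    @Override
    public boolean filterDeleteMarker(String qualifier) {
        seen++;
        return seen < maxMarkers;
    }
}

final class FilterDispatch {
    /**
     * Delete markers are routed only to filters that opted in via the new
     * interface, so existing filters never see them and cannot be confused.
     */
    static List<String> scan(List<String> rawCells, CellFilter filter) {
        List<String> out = new ArrayList<>();
        for (String cell : rawCells) {
            if (cell.startsWith("D:")) {
                if (filter instanceof DeleteMarkerAwareFilter) {
                    DeleteMarkerAwareFilter aware = (DeleteMarkerAwareFilter) filter;
                    if (!aware.filterDeleteMarker(cell.substring(2))) {
                        break; // filter asked to stop
                    }
                }
                continue; // markers never reach includeCell
            }
            if (filter.includeCell(cell)) {
                out.add(cell);
            }
        }
        return out;
    }
}
```

The point of the separate interface is backward compatibility: a filter written against CellFilter behaves exactly as before, while a filter such as MaxDeleteMarkersFilter can react to markers.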
Varun Sharma 2013-07-01, 15:18
lars hofhansl 2013-07-01, 15:32