Re: Issues with delete markers
lars hofhansl 2013-07-01, 15:32
That is the easy part :)
The hard part is to add this to filters in a backwards compatible way.

-- Lars
----- Original Message -----
From: Varun Sharma <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Monday, July 1, 2013 8:18 AM
Subject: Re: Issues with delete markers

I mean version tracking with delete markers...
On Mon, Jul 1, 2013 at 8:17 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> So, yesterday, I implemented this change via a coprocessor which basically
> initiates a raw scan, keeps track of the number of delete markers
> encountered, and stops when a configured threshold is met. It instantiates
> its own ScanDeleteTracker to do the masking of deleted cells. In short: raw
> scan, count the delete markers, stop if too many are encountered, and mask
> them so that sane results are returned to the client.
>
> So far it has been working reasonably well. Also, with HBASE-8809,
> version tracking etc. should now work with filters as well.
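For illustration, here is a rough client-side sketch of the idea described above, assuming 0.94-era client APIs; the table name, row key, and thresholds are placeholders, and the masking is deliberately simplified (it only tracks column delete markers, rather than reusing a full ScanDeleteTracker as the coprocessor does):

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BoundedDeleteScan {

      // Stop once this many delete markers have been seen (placeholder values).
      private static final int MAX_DELETE_MARKERS = 5000;
      private static final int MAX_COLUMNS = 300;

      public static void main(String[] args) throws IOException {
        HTable table = new HTable(HBaseConfiguration.create(), "mytable");
        // In practice, also set a stop row to bound the scan to the row of interest.
        Scan scan = new Scan(Bytes.toBytes("myrow"));
        scan.setRaw(true);      // return delete markers to the client
        scan.setMaxVersions();  // see all versions, not just the newest

        int deleteMarkers = 0;
        int columns = 0;
        // qualifier -> timestamp of the newest column delete marker seen so far
        Map<String, Long> deletedColumns = new HashMap<String, Long>();

        for (Result result : table.getScanner(scan)) {
          for (KeyValue kv : result.raw()) {
            if (kv.isDelete()) {
              deleteMarkers++;
              // Simplified masking: only column-level delete markers are tracked.
              deletedColumns.put(Bytes.toString(kv.getQualifier()), kv.getTimestamp());
            } else {
              Long deletedTs = deletedColumns.get(Bytes.toString(kv.getQualifier()));
              if (deletedTs == null || kv.getTimestamp() > deletedTs) {
                columns++;  // cell is not masked by a delete marker
                // ... hand the cell back to the caller here ...
              }
            }
            if (deleteMarkers >= MAX_DELETE_MARKERS || columns >= MAX_COLUMNS) {
              table.close();
              return;  // bail out instead of churning through more markers
            }
          }
        }
        table.close();
      }
    }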
>
>
> On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> That would be quite a dramatic change; we cannot pass delete markers to the
>> existing filters without confusing them.
>> We could invent a new method (filterDeleteKV or filterDeleteMarker or
>> something) on filters along with a new "filter type" that implements that
>> method.
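No such method exists in HBase as of this thread; the following is a purely hypothetical sketch of what the suggested extension point might look like, where only the name filterDeleteMarker comes from the suggestion above and everything else is assumed:

    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.filter.FilterBase;

    // Hypothetical extension point: filters that opt in by extending this class
    // would also be shown delete markers, while existing filters keep their
    // current behaviour and never see them (preserving backwards compatibility).
    public abstract class DeleteMarkerAwareFilter extends FilterBase {

      // Hypothetical: called once per delete marker encountered during the scan,
      // mirroring filterKeyValue() for ordinary cells. The ReturnCode decides
      // whether to include the marker, skip it, or stop the scan early.
      public abstract ReturnCode filterDeleteMarker(KeyValue deleteMarker);
    }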
>>
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Varun Sharma <[EMAIL PROTECTED]>
>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; [EMAIL PROTECTED]
>> Cc:
>> Sent: Sunday, June 30, 2013 1:56 PM
>> Subject: Re: Issues with delete markers
>>
>> Sorry, typo. I meant that for user scans, should we be passing delete
>> markers through the filters as well?
>>
>> Varun
>>
>>
>> On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <[EMAIL PROTECTED]>
>> wrote:
>>
>> > For user scans, I feel we should be passing delete markers through as
>> > well.
>> >
>> >
>> > On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <[EMAIL PROTECTED]>
>> > wrote:
>> >
>> >> I tried this a little bit and it seems that filters are not called on
>> >> delete markers. For raw scans returning delete markers, does it make
>> >> sense to do that?
>> >>
>> >> Varun
>> >>
>> >>
>> >> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <[EMAIL PROTECTED]>
>> >> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> We are having an issue with the way HBase handles deletes. We are
>> >>> looking to retrieve 300 columns in a row, but the row has tens of
>> >>> thousands of delete markers that we must scan through before we span
>> >>> the 300 columns, something like this:
>> >>>
>> >>>
>> >>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3
>> >>>
>> >>> And so on. The issue is that to retrieve these 300 columns, we need to
>> >>> go through tens of thousands of deletes; sometimes we get a spurt of
>> >>> these queries, and that DDoSes a region server. We are okay with
>> >>> saying: only return the first 300 columns, and stop once you encounter,
>> >>> say, 5K column delete markers or something.
>> >>>
>> >>> I wonder if such a construct is provided by HBase, or do we need to
>> >>> build something on top of a raw scan and handle the delete masking there.
>> >>>
>> >>> Thanks
>> >>> Varun
>> >>>
>> >>>
>> >>>
>> >>
>> >
>>
>>
>