HBase >> mail # user >> Re: Issues with delete markers


+ lars hofhansl 2013-07-01, 10:58
+ Varun Sharma 2013-07-01, 15:17

Re: Issues with delete markers
I mean version tracking with delete markers...
On Mon, Jul 1, 2013 at 8:17 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> So, yesterday, I implemented this change via a coprocessor which
> initiates a raw scan, keeps track of the number of delete markers
> encountered, and stops when a configured threshold is met. It instantiates
> its own ScanDeleteTracker to do the masking of delete markers. So: raw
> scan, count the delete markers, stop if too many are encountered, and mask
> them so as to return sane results back to the client.
>
> So far it has been working reasonably well. Also, with HBASE-8809,
> version tracking etc. should now work with filters as well.
>
>
> On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> That would be quite a dramatic change; we cannot pass delete markers to
>> the existing filters without confusing them.
>> We could invent a new method (filterDeleteKV or filterDeleteMarker or
>> something) on filters, along with a new "filter type" that implements that
>> method.
>>
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Varun Sharma <[EMAIL PROTECTED]>
>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; [EMAIL PROTECTED]
>> Cc:
>> Sent: Sunday, June 30, 2013 1:56 PM
>> Subject: Re: Issues with delete markers
>>
>> Sorry, typo. I meant: for user scans, should we be passing delete
>> markers through the filters as well?
>>
>> Varun
>>
>>
>> On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <[EMAIL PROTECTED]>
>> wrote:
>>
>> > For user scans, I feel we should be passing delete markers through as
>> well.
>> >
>> >
>> > On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <[EMAIL PROTECTED]
>> >wrote:
>> >
>> >> I tried this a little bit and it seems that filters are not called on
>> >> delete markers. For raw scans that return delete markers, does it make
>> >> sense to do that?
>> >>
>> >> Varun
>> >>
>> >>
>> >> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <[EMAIL PROTECTED]
>> >wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> We are having an issue with the way HBase handles deletes. We are
>> >>> looking to retrieve 300 columns in a row, but the row has tens of
>> >>> thousands of delete markers in it before we reach those 300 columns,
>> >>> something like this:
>> >>>
>> >>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3
>> >>>
>> >>> And so on. The issue here is that to retrieve these 300 columns, we
>> >>> need to go through tens of thousands of deletes; sometimes we get a
>> >>> spurt of these queries, and that DDoSes a region server. We are okay
>> >>> with saying: only return the first 300 columns, and stop once you
>> >>> encounter, say, 5K column delete markers.
>> >>>
>> >>> I wonder if such a construct is provided by HBase, or do we need to
>> >>> build something on top of the raw scan and handle the delete masking
>> >>> there.
>> >>>
>> >>> Thanks
>> >>> Varun
>> >>>
>> >>>
>> >>>
>> >>
>> >
>>
>>
>
+ lars hofhansl 2013-07-01, 15:32