HBase has the preCompactScannerOpen() and preCompact() coprocessor hooks.
As we said, for a compaction the core creates a scanner that reads all the HFiles being compacted (all HFiles of the store, in a major compaction) and then scans through it by calling next() repeatedly.
Using these hooks you can put a wrapper around the actual scanner. In fact the preCompact() hook should be enough for you. By the time it is called, the actual scanner has already been created and is passed to your hook. You can write a custom scanner implementation that wraps the actual scanner, and return that wrapper from the hook (yes, its return type is InternalScanner). The wrapper delegates the real scanning to the actual scanner. All the KVs that the underlying scanner returns then flow through your wrapper, where you can drop certain KVs based on your own condition or logic.
Core                          WrapperScannerImpl             Actual scanner (created by core)
 -> next(List<KeyValue>)       -> next(List<KeyValue>)
                                                             <- does the real scan from HFiles
                               <- looks at the list of KVs
                                  and removes those you
                                  don't want
 <- only the KVs that passed
    end up in the final output
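
The flow above can be sketched in plain Java. This is only an illustration under stated assumptions, not HBase code: Cell, RowScanner, ListScanner, FilteringScanner and CompactionFilterDemo are hypothetical stand-ins so that the example runs without HBase on the classpath. In a real coprocessor the wrapper would implement InternalScanner, work on KeyValue objects, and be returned from preCompact(). The predicate here drops cells where c1 = 'a', as an example condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for HBase's KeyValue: only row, qualifier and value. */
final class Cell {
    final String row;
    final String qualifier;
    final String value;

    Cell(String row, String qualifier, String value) {
        this.row = row;
        this.qualifier = qualifier;
        this.value = value;
    }
}

/**
 * Simplified stand-in for InternalScanner. The real interface has more
 * next(...) overloads and declares IOException; both are left out here.
 */
interface RowScanner {
    /** Appends the next row's cells to results; returns false once no rows remain after it. */
    boolean next(List<Cell> results);

    void close();
}

/** Plays the role of the "actual scanner": serves rows from memory instead of HFiles. */
final class ListScanner implements RowScanner {
    private final Iterator<List<Cell>> rows;

    ListScanner(List<List<Cell>> rows) { this.rows = rows.iterator(); }

    @Override public boolean next(List<Cell> results) {
        if (rows.hasNext()) results.addAll(rows.next());
        return rows.hasNext();
    }

    @Override public void close() { }
}

/** The wrapper: delegates the scan, then removes the cells matching the drop condition. */
final class FilteringScanner implements RowScanner {
    private final RowScanner delegate;
    private final Predicate<Cell> drop;

    FilteringScanner(RowScanner delegate, Predicate<Cell> drop) {
        this.delegate = delegate;
        this.drop = drop;
    }

    @Override public boolean next(List<Cell> results) {
        int start = results.size();              // leave cells added by earlier calls alone
        boolean more = delegate.next(results);   // the real scan happens in the delegate
        results.subList(start, results.size()).removeIf(drop);
        return more;
    }

    @Override public void close() { delegate.close(); }
}

final class CompactionFilterDemo {
    static List<List<Cell>> sampleRows() {
        List<List<Cell>> rows = new ArrayList<>();
        rows.add(Arrays.asList(new Cell("r1", "c1", "a"), new Cell("r1", "c2", "x")));
        rows.add(Arrays.asList(new Cell("r2", "c1", "b")));
        return rows;
    }

    /** Drains a scanner row by row, as a compaction loop would, and lists what survives. */
    static List<String> drain(RowScanner scanner) {
        List<String> out = new ArrayList<>();
        List<Cell> batch = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(batch);
            for (Cell c : batch) out.add(c.row + "/" + c.qualifier + "=" + c.value);
            batch.clear();
        } while (more);
        scanner.close();
        return out;
    }

    public static void main(String[] args) {
        Predicate<Cell> drop = c -> c.qualifier.equals("c1") && c.value.equals("a");
        RowScanner wrapped = new FilteringScanner(new ListScanner(sampleRows()), drop);
        System.out.println(drain(wrapped)); // prints [r1/c2=x, r2/c1=b]
    }
}
```

The caller only ever sees the wrapper, so whatever the predicate drops never reaches the compacted output. Any per-KV action on a dropped cell could be hooked in at the point where the predicate matches.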
Hope that makes it clear for you :)
Note: preCompactScannerOpen() is called before the actual scanner is even created, while preCompact() is called after the scanner has been created. You can see the code in Store#compactStore().
From: yun peng [[EMAIL PROTECTED]]
Sent: Wednesday, October 17, 2012 9:04 PM
To: [EMAIL PROTECTED]
Subject: Re: Where is code in hbase that physically delete a record?
Hi, Ram and Anoop, Thanks for the nice reference to the Java file, which I
will read through.
It is interesting to learn about the recent preCompactScannerOpen() hook.
Ram, it would be nice to know how to specify conditions like c1 = 'a'. I
have also checked the example code in HBASE-6496
<https://issues.apache.org/jira/browse/HBASE-6496>, which shows how to
delete data older than a given time, as in an on-demand specification...
On Wed, Oct 17, 2012 at 8:46 AM, Ramkrishna.S.Vasudevan <
[EMAIL PROTECTED]> wrote:
> Also, to see in code how the delete happens, please refer to
> StoreScanner.java and to how ScanQueryMatcher.match() works.
> That is where we decide whether a KV has to be skipped because it is
> masked by a delete tombstone marker.
> Forgot to tell you about this.
> > -----Original Message-----
> > From: yun peng [mailto:[EMAIL PROTECTED]]
> > Sent: Wednesday, October 17, 2012 5:54 PM
> > To: [EMAIL PROTECTED]
> > Subject: Where is code in hbase that physically delete a record?
> > Hi, All,
> > I want to find the internal code in HBase where the physical deletion
> > of a record occurs.
> > - Some of my understanding.
> > Correct me if I am wrong. (It is largely based on my experience and
> > even speculation.) Logically deleting a KeyValue in HBase is done by
> > writing a tombstone marker (via Delete(), per record) or by setting
> > TTL/max versions (per Store). After these actions, however, the
> > physical data is still there, somewhere in the system. Physically
> > deleting a record in HBase is realised by *a scanner discarding a
> > KeyValue record* during major compaction.
> > - What I need.
> > I want to extend HBase to associate some actions with physically
> > deleting a record. Does HBase provide a hook (or coprocessor API) to
> > inject code for each KV record that is skipped by the HBase
> > StoreScanner during major compaction?
> > If not, does anyone know where I should look in HBase (0.94.2) to make
> > such a code modification?
> > Thanks.
> > Yun