|
yun peng
2012-10-17, 12:24
Anoop Sam John
2012-10-17, 12:31
Ramkrishna.S.Vasudevan
2012-10-17, 12:42
Ramkrishna.S.Vasudevan
2012-10-17, 12:46
yun peng
2012-10-17, 15:34
Anoop Sam John
2012-10-18, 03:32
Ramkrishna.S.Vasudevan
2012-10-18, 04:20
PG
2012-10-19, 19:41
Anoop John
2012-10-20, 03:08
ramkrishna vasudevan
2012-10-20, 05:50
|
-
Where is code in hbase that physically delete a record? yun peng 2012-10-17, 12:24
Hi, All,
I want to find the internal code in HBase where a record is physically deleted.

- My understanding so far
Correct me if I am wrong. (It is largely based on my experience and even speculation.) A KeyValue in HBase is logically deleted by writing a tombstone marker (a Delete() per record) or by setting TTL/max versions (per Store). After these actions, however, the physical data are still there, somewhere in the system. A record is physically deleted when *a scanner discards the KeyValue* during major_compact.

- What I need
I want to extend HBase to associate some actions with the physical deletion of a record. Does HBase provide a hook (or coprocessor API) to inject code for each KV that is skipped by the HBase StoreScanner during major_compact? If not, does anyone know where I should look in HBase (0.94.2) to make such a modification?

Thanks.
Yun
-
RE: Where is code in hbase that physically delete a record? Anoop Sam John 2012-10-17, 12:31
You can see the code in ScanQueryMatcher.
Basically, a major compaction runs a scan over all the files. Because of the delete markers, the deleted KVs do not come out of the scanner and are thus eliminated. In a major compaction the delete markers themselves are also removed (some more complicated conditions apply here, such as keep-deleted-cells and the time to purge deletes). I would say check the code in that class.

-Anoop-
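The elimination described above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not HBase code: `ToyCell` and `ToyMajorCompaction` are hypothetical names, only column-level tombstones are modelled (a tombstone hides every put in the same row and column with a timestamp at or below its own), and the keep-deleted-cells and purge-delay conditions are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an HBase KeyValue; not the real class. */
class ToyCell {
    final String row;
    final String qualifier;
    final long timestamp;
    final boolean deleteMarker; // true for a column-level tombstone
    final String value;         // null for tombstones

    ToyCell(String row, String qualifier, long timestamp, boolean deleteMarker, String value) {
        this.row = row;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.deleteMarker = deleteMarker;
        this.value = value;
    }
}

class ToyMajorCompaction {
    /** True if a tombstone for the same row and column has a timestamp >= the put's. */
    static boolean isMasked(ToyCell put, List<ToyCell> all) {
        for (ToyCell marker : all) {
            if (marker.deleteMarker
                    && marker.row.equals(put.row)
                    && marker.qualifier.equals(put.qualifier)
                    && put.timestamp <= marker.timestamp) {
                return true;
            }
        }
        return false;
    }

    /** Returns the cells that would be copied into the new file. */
    static List<ToyCell> compact(List<ToyCell> all) {
        List<ToyCell> out = new ArrayList<>();
        for (ToyCell cell : all) {
            if (cell.deleteMarker) {
                continue; // in this toy model the markers themselves are dropped too
            }
            if (!isMasked(cell, all)) {
                out.add(cell);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<ToyCell> cells = Arrays.asList(
            new ToyCell("r1", "c1", 5L, false, "old"),
            new ToyCell("r1", "c1", 10L, true, null),
            new ToyCell("r1", "c1", 20L, false, "new"),
            new ToyCell("r1", "c2", 5L, false, "x"));
        for (ToyCell c : compact(cells)) {
            System.out.println(c.row + "/" + c.qualifier + "@" + c.timestamp + " = " + c.value);
        }
    }
}
```

In this toy run the put at timestamp 5 in c1 is hidden by the tombstone at timestamp 10, so only the put at timestamp 20 and the c2 cell are carried over.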
-
RE: Where is code in hbase that physically delete a record? Ramkrishna.S.Vasudevan 2012-10-17, 12:42
Hi Yun

> Logically deleting a KeyValue data in hbase is performed by marking
> tombmarker (by Delete() per records) or setting TTL/max_version (per Store).
> After these actions, however, the physical data are still there, somewhere
> in the system. Physically deleting a record in hbase is realised by *a
> scanner to discard a keyvalue data record* during the major_compact.

Yes, correct. As you understood, major_compact leaves out the deleted records when the KVs are copied into a new file.

In 0.94.2 some new hooks were added, such as preCompactScannerOpen, where you can supply your own scanner implementation. That lets you write custom logic for which KVs to leave out during compaction. For example, say you don't want any KV where a specific column c1 has the value 'a': you can write your scanner and pass it through preCompactScannerOpen.

But when the system itself is dropping the KVs that got deleted, there is currently no hook in the coprocessor (CP) framework to get those values specifically.

Hope this helps.

Regards
Ram
-
RE: Where is code in hbase that physically delete a record? Ramkrishna.S.Vasudevan 2012-10-17, 12:46
Also, to see how the delete happens in code, please refer to StoreScanner.java and to how ScanQueryMatcher.match() works. That is where we decide whether a KV has to be skipped because of an earlier tombstone marker. Forgot to tell you about this.

Regards
Ram
-
Re: Where is code in hbase that physically delete a record? yun peng 2012-10-17, 15:34
Hi, Ram and Anoop,
Thanks for the nice pointers to the Java files, which I will check through.

It is interesting to know about the recent preCompactScannerOpen() hook. Ram, it would be nice to know how to specify conditions like c1 = 'a'. I have also checked the example code in HBASE-6496 <https://issues.apache.org/jira/browse/HBASE-6496>, which shows how to delete data older than a given time, specified on demand...

Cheers,
Yun
-
RE: Where is code in hbase that physically delete a record? Anoop Sam John 2012-10-18, 03:32
Hi Yun,
We have the preCompactScannerOpen() and preCompact() hooks. As we said, for a compaction a scanner is created that reads all the corresponding HFiles (all HFiles in a major compaction), and the core scans through it by calling its next() methods.

Using these hooks you can create a wrapper over the actual scanner. In fact you can use the preCompact() hook (I think that is fine for you). By the time it is called, the actual scanner has been created and that object is passed to your hook. You can create a custom scanner implementation, wrap the actual scanner inside it, and return the new wrapper scanner from the hook. [Yes, its return type is InternalScanner.] The actual scanner serves as a delegate that does the real scanning. All the KVs that the underlying scanner passes then flow through your wrapper scanner, where you can drop certain KVs based on your condition or logic:

Core                       WrapperScannerImpl           Actual Scanner (created by core)
  -> next(List<KeyValue>)    -> next(List<KeyValue>)
                                                        Do the real scan from HFiles
                             <-
                             See the list of KVs and
                             remove those you don't want
  <-
  Only the passed KVs
  come in final merged file

Hope I make it clear for you :)

Note: preCompactScannerOpen() is called before the actual scanner is even created, while preCompact() is called after the scanner is created. You can see the code in Store#compactStore().

-Anoop-
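The call flow in the diagram can be tried out without a cluster. The sketch below is a plain-Java analogue under stated assumptions: `ToyScanner` is a hypothetical, much simplified interface (it is not HBase's InternalScanner), KVs are represented as strings, and `drain` stands in for the core's compaction loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical, simplified analogue of InternalScanner (not the HBase interface). */
interface ToyScanner {
    /** Appends the next batch to results; returns true if more batches remain. */
    boolean next(List<String> results);
}

/** Plays the role of the actual scanner created by the core. */
class ToyActualScanner implements ToyScanner {
    private final Iterator<List<String>> batches;

    ToyActualScanner(List<List<String>> batches) {
        this.batches = batches.iterator();
    }

    @Override
    public boolean next(List<String> results) {
        if (batches.hasNext()) {
            results.addAll(batches.next());
        }
        return batches.hasNext();
    }
}

/** Plays the role of the wrapper returned from the hook. */
class ToyWrapperScanner implements ToyScanner {
    private final ToyScanner delegate;
    private final Predicate<String> keep;

    ToyWrapperScanner(ToyScanner delegate, Predicate<String> keep) {
        this.delegate = delegate;
        this.keep = keep;
    }

    @Override
    public boolean next(List<String> results) {
        List<String> batch = new ArrayList<>();
        boolean more = delegate.next(batch); // the delegate does the real scan
        for (String kv : batch) {
            if (keep.test(kv)) {
                results.add(kv);             // only KVs that pass reach the caller
            }
        }
        return more;
    }
}

class ToyCompactionDemo {
    /** Plays the role of the core: drains the scanner into the "merged file". */
    static List<String> drain(ToyScanner scanner) {
        List<String> merged = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(merged);
        } while (more);
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
            Arrays.asList("r1/c1", "r1/tmp"),
            Arrays.asList("r2/c1"));
        ToyScanner wrapper = new ToyWrapperScanner(
            new ToyActualScanner(batches), kv -> !kv.endsWith("/tmp"));
        System.out.println(drain(wrapper));
    }
}
```

The caller never learns that a wrapper sits in between, which is why returning the wrapper from the hook is enough to change what ends up in the merged output.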
-
RE: Where is code in hbase that physically delete a record? Ramkrishna.S.Vasudevan 2012-10-18, 04:20
Hi Yun
Hope Anoop's clear explanation helps you. Just to add: after you wrap the StoreScanner in your custom scanner implementation, you invoke next(List<KeyValue>) on the delegate (here the delegate is the actual StoreScanner). The delegate gives you the list of KVs that it has fetched from the underlying scanners (memstore and StoreFileScanner). On the returned KVs you can then do a check, say whether a KV has column c1 with value 'a', and skip it, so that the custom scanner does not pass that KV on to the compaction that consumes it.

The code may look like this:

    class CustomScanner implements InternalScanner {
        private final InternalScanner delegate;  // the actual StoreScanner

        public CustomScanner(InternalScanner delegate) {
            this.delegate = delegate;
        }

        public boolean next(List<KeyValue> kvs) throws IOException {
            boolean more = delegate.next(kvs);
            for (Iterator<KeyValue> it = kvs.iterator(); it.hasNext(); ) {
                KeyValue kv = it.next();
                // Do necessary filtering here, e.g. it.remove() for unwanted KVs.
            }
            return more;
        }

        // The remaining InternalScanner methods delegate in the same way.
    }

Regards
Ram
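For the "column c1 with value 'a'" check itself, one detail is worth showing: qualifiers and values are byte arrays, so they have to be compared by content. The sketch below is an assumption-laden illustration in plain Java: `ToyKeyValue` and `ColumnValueCondition` are hypothetical names, and `java.util.Arrays.equals` is used for the comparison (in real HBase code `Bytes.equals` from the HBase utility classes serves the same purpose).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for KeyValue, holding only the fields the check needs. */
class ToyKeyValue {
    final byte[] qualifier;
    final byte[] value;

    ToyKeyValue(String qualifier, String value) {
        this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
        this.value = value.getBytes(StandardCharsets.UTF_8);
    }
}

class ColumnValueCondition {
    private static final byte[] C1 = "c1".getBytes(StandardCharsets.UTF_8);
    private static final byte[] A = "a".getBytes(StandardCharsets.UTF_8);

    /** True when the cell is column c1 with value 'a'; compares contents, not references. */
    static boolean matches(ToyKeyValue kv) {
        return Arrays.equals(kv.qualifier, C1) && Arrays.equals(kv.value, A);
    }

    /** Removes matching cells from the batch in place; returns how many were removed. */
    static int removeMatching(List<ToyKeyValue> batch) {
        int removed = 0;
        // Iterator.remove avoids a ConcurrentModificationException while looping.
        for (Iterator<ToyKeyValue> it = batch.iterator(); it.hasNext(); ) {
            if (matches(it.next())) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<ToyKeyValue> batch = new ArrayList<>(Arrays.asList(
            new ToyKeyValue("c1", "a"),
            new ToyKeyValue("c1", "b"),
            new ToyKeyValue("c2", "a")));
        int removed = removeMatching(batch);
        System.out.println("removed=" + removed + ", remaining=" + batch.size());
    }
}
```

The batch has to be a mutable list for in-place removal to work, which is why the demo copies the fixed-size list into an ArrayList first.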
-
Re: Where is code in hbase that physically delete a record? PG 2012-10-19, 19:41
Hi, Anoop and Ram,
I have now coded the idea, and the detailed instructions were very helpful. One minor thing to add: the KeyValues coming out of the scanner are already sorted by column qualifier and timestamp. I did not find this mentioned in the Javadoc, but I found it a very useful feature for filtering.

Thanks.
Yun
-
Re: Where is code in hbase that physically delete a record? Anoop John 2012-10-20, 03:08
Yes, the KVs coming out of your delegate scanner will be in sorted order, and with all the other logic already applied, such as removing TTL-expired data and handling max versions. Thanks for the update.

-Anoop-
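What "TTL-expired data removed and max versions handled" means for the versions of a single column can be sketched as follows. This is a toy model under assumptions, not the HBase implementation: `ToyVersionPruner` is a hypothetical name, a version counts as expired when its timestamp is older than now minus the TTL, and the input is assumed to be ordered newest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToyVersionPruner {
    /**
     * Keeps at most maxVersions versions that are not older than the TTL.
     *
     * @param timestampsNewestFirst versions of one column, newest first
     */
    static List<Long> prune(List<Long> timestampsNewestFirst, int maxVersions,
                            long ttlMillis, long nowMillis) {
        List<Long> kept = new ArrayList<>();
        long oldestAllowed = nowMillis - ttlMillis;
        for (long ts : timestampsNewestFirst) {
            if (kept.size() == maxVersions) {
                break; // version limit reached
            }
            if (ts < oldestAllowed) {
                break; // this version and all older ones are expired
            }
            kept.add(ts);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> versions = Arrays.asList(900L, 700L, 400L, 100L);
        // now = 1000 and TTL = 500, so anything older than 500 is expired.
        System.out.println(prune(versions, 3, 500L, 1000L));
        System.out.println(prune(versions, 1, 500L, 1000L));
    }
}
```

Because the input is newest first, both limits can stop the loop early, which is one reason the sort order of the scanner output matters.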
-
Re: Where is code in hbase that physically delete a record? ramkrishna vasudevan 2012-10-20, 05:50
Hi
KVs that come out of a scan are always sorted lexicographically, and the most recent timestamps come out first. So even if you write column qualifier c2 first and then c1, c1 will come out first because of the lexicographic ordering. Likewise, the most recent versions within a row come out first, and the number of versions is user-controlled. Have fun!!!

Regards
Ram
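The ordering described above can be written down as a comparator. This is a simplified sketch under assumptions: `ToyKey` and `ToyKeyOrder` are hypothetical names, the key has only row, qualifier, and timestamp, and Java string comparison stands in for HBase's byte-wise comparison (the two agree for plain ASCII names such as c1 and c2).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical, reduced key: row, column qualifier, timestamp. */
class ToyKey {
    final String row;
    final String qualifier;
    final long timestamp;

    ToyKey(String row, String qualifier, long timestamp) {
        this.row = row;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
    }

    @Override
    public String toString() {
        return row + "/" + qualifier + "@" + timestamp;
    }
}

class ToyKeyOrder {
    /** Row ascending, then qualifier ascending, then timestamp descending. */
    static final Comparator<ToyKey> ORDER = new Comparator<ToyKey>() {
        @Override
        public int compare(ToyKey a, ToyKey b) {
            int c = a.row.compareTo(b.row);
            if (c != 0) {
                return c;
            }
            c = a.qualifier.compareTo(b.qualifier);
            if (c != 0) {
                return c;
            }
            return Long.compare(b.timestamp, a.timestamp); // newest first
        }
    };

    public static void main(String[] args) {
        // Written in the order c2, then c1 (old), then c1 (new).
        List<ToyKey> keys = new ArrayList<>(Arrays.asList(
            new ToyKey("r1", "c2", 1L),
            new ToyKey("r1", "c1", 5L),
            new ToyKey("r1", "c1", 9L)));
        keys.sort(ORDER);
        System.out.println(keys);
    }
}
```

After sorting, c1 precedes c2 regardless of write order, and within c1 the version with the larger timestamp comes first.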