HBase >> mail # user >> Filtering/Collection columns during Major Compaction


Re: Filtering/Collection columns during Major Compaction
Hi Lars,

Thanks for the detailed tip; we will go down that path. The javadoc for
InternalScanner.next() says it grabs the next row's values. Are these rows
in the HBase sense, or rows in the HFile? I suspect the latter.

Thanks !

On Mon, Dec 10, 2012 at 11:19 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> Filters do not work for compactions. We only support them for user scans.
> (some of them might incidentally work, but that is entirely untested and
> unsupported)
>
> Your best bet is to use the preCompact hook and return a wrapper scanner
> like so:
>
>     public InternalScanner preCompact(
>         ObserverContext<RegionCoprocessorEnvironment> e,
>         Store store, final InternalScanner scanner) {
>       return new InternalScanner() {
>         public boolean next(List<KeyValue> results) throws IOException {
>           return next(results, -1);
>         }
>         public boolean next(List<KeyValue> results, String metric)
>             throws IOException {
>           return next(results, -1, metric);
>         }
>         public boolean next(List<KeyValue> results, int limit)
>             throws IOException{
>           return next(results, limit, null);
>         }
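>         public boolean next(List<KeyValue> results, int limit,
>             String metric) throws IOException {
>           // call next on the passed scanner
>           boolean more = scanner.next(results, limit, metric);
>           // do your filtering here, e.g. remove unwanted KeyValues
>           // from results
>           return more;
>         }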
>
>         public void close() throws IOException {
>           scanner.close();
>         }
>       };
>     }
>
> -- Lars
>
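The wrapper pattern described above, with the "do your filtering here" step filled in, can be sketched without any HBase dependency. This is only an illustration under stated assumptions, not HBase code: `Cell`, `RowScanner`, `keepFirstColumns`, `inMemory`, and `drain` are hypothetical stand-ins (for `KeyValue`, `InternalScanner`, and the compaction loop), and the filter keeps at most a fixed number of columns per row, loosely in the spirit of ColumnPaginationFilter with offset 0. It also assumes that one row may arrive split across several `next()` calls, so the per-row counter is keyed on the row key rather than reset on every call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class CompactionFilterSketch {

    /** Hypothetical stand-in for an HBase KeyValue: row key plus column qualifier. */
    static final class Cell {
        final String row;
        final String qualifier;
        Cell(String row, String qualifier) {
            this.row = row;
            this.qualifier = qualifier;
        }
    }

    /** Hypothetical stand-in for InternalScanner, reduced to a single next() overload. */
    interface RowScanner {
        /** Appends the next batch of cells to results; returns true if more data remains. */
        boolean next(List<Cell> results) throws IOException;
        void close() throws IOException;
    }

    /** Wraps a scanner and passes through at most `limit` cells per row. */
    static RowScanner keepFirstColumns(final RowScanner delegate, final int limit) {
        return new RowScanner() {
            private String currentRow; // row key of the most recently seen cell
            private int seen;          // cells seen so far for currentRow

            @Override
            public boolean next(List<Cell> results) throws IOException {
                List<Cell> batch = new ArrayList<>();
                boolean more = delegate.next(batch); // call next on the passed scanner
                for (Cell c : batch) {               // do the filtering here
                    if (!c.row.equals(currentRow)) { // new row: restart the count
                        currentRow = c.row;
                        seen = 0;
                    }
                    if (seen < limit) {
                        results.add(c);
                    }
                    seen++;
                }
                return more;
            }

            @Override
            public void close() throws IOException {
                delegate.close();
            }
        };
    }

    /** Test scanner that hands out pre-built batches one at a time. */
    static RowScanner inMemory(List<List<Cell>> batches) {
        final Iterator<List<Cell>> it = batches.iterator();
        return new RowScanner() {
            @Override
            public boolean next(List<Cell> results) {
                if (it.hasNext()) {
                    results.addAll(it.next());
                }
                return it.hasNext();
            }

            @Override
            public void close() { }
        };
    }

    /** Drains a scanner the way a compaction-style loop might, as "row/qualifier" strings. */
    static List<String> drain(RowScanner scanner) throws IOException {
        List<String> out = new ArrayList<>();
        List<Cell> batch = new ArrayList<>();
        boolean more;
        do {
            batch.clear();
            more = scanner.next(batch);
            for (Cell c : batch) {
                out.add(c.row + "/" + c.qualifier);
            }
        } while (more);
        scanner.close();
        return out;
    }

    /** Row r1 is deliberately split across two batches; only its first two columns survive. */
    static List<String> demo() {
        List<List<Cell>> batches = Arrays.asList(
            Arrays.asList(new Cell("r1", "a")),
            Arrays.asList(new Cell("r1", "b"), new Cell("r1", "c")),
            Arrays.asList(new Cell("r2", "a")));
        try {
            return drain(keepFirstColumns(inMemory(batches), 2));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [r1/a, r1/b, r2/a]
    }
}
```

In a real coprocessor the same idea would live inside the four-argument `next()` of the anonymous `InternalScanner`, with the other overloads and `close()` delegating as in the snippet above.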
>
>
> ________________________________
>  From: Varun Sharma <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Sent: Monday, December 10, 2012 11:04 PM
> Subject: Re: Filtering/Collection columns during Major Compaction
>
> Hi Lars,
>
> In my case, I just want to use ColumnPaginationFilter rather than
> implementing my own filter logic. Is there an easy way to apply this
> filter on top of an existing scanner? Would I do something like
>
> RegionScannerImpl scanner = new RegionScannerImpl(scan_with_my_filter,
> original_compaction_scanner)
>
> Thanks
> Varun
>
> On Mon, Dec 10, 2012 at 9:09 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > In your case you probably just want to filter on top of the provided
> > scanner with preCompact (rather than actually replacing the scanner,
> > which preCompactScannerOpen does).
> >
> > (And sorry, I only saw this reply after I sent my own reply to your
> > initial question.)
> >
> >
> >
> > ________________________________
> >  From: Varun Sharma <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Sent: Monday, December 10, 2012 7:29 AM
> > Subject: Re: Filtering/Collection columns during Major Compaction
> >
> > Okay - I looked more thoroughly again - I should be able to extract these
> > from the region observer.
> >
> > Thanks !
> >
> > On Mon, Dec 10, 2012 at 6:59 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> >
> > > Thanks! This is exactly what I need. I am looking at the code in
> > > compactStore() in Store.java, but I am trying to understand why
> > > smallestReadPoint needs to be passed for the real compaction. I thought
> > > the read point was a memstore-only thing. Also, preCompactScannerOpen
> > > does not have a way of passing this value.
> > >
> > > Varun
> > >
> > >
> > > On Mon, Dec 10, 2012 at 6:08 AM, ramkrishna vasudevan <[EMAIL PROTECTED]> wrote:
> > >
> > >> Hi Varun
> > >>
> > >> If you are using version 0.94, there are coprocessor hooks that are
> > >> invoked before and after compaction selection.
> > >> preCompactScannerOpen() lets you create your own scanner, which
> > >> actually performs the next() operation.
> > >> If you wrap the scanner and implement your own next(), you can
> > >> work with the KVs as needed, so you can decide which columns to
> > >> include and which to exclude.