HBase >> mail # user >> Coprocessors and batch processing

Re: Coprocessors and batch processing
Client-side batch processing is done at the RegionServer level, i.e., all
Action objects are grouped per RegionServer (RS) and sent in one RPC. Once
the batch arrives at an RS, it is distributed across the corresponding
Regions, and the Action objects are processed one by one. This includes a
coprocessor's Exec objects too.
So, a coprocessor works at "Region" level granularity.
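
To make the two-level grouping concrete, here is a minimal sketch in plain Java. It does not use the HBase client; SketchAction, BatchGroupingSketch, and the server/region names are hypothetical stand-ins chosen for illustration. It only shows the idea described above: actions are first grouped per RegionServer (one RPC each), and each server's batch is then split per Region.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a client Action; not the real HBase class.
final class SketchAction {
    final String server; // RegionServer hosting the target region
    final String region; // region that holds the row
    final String row;    // row key the action applies to

    SketchAction(String server, String region, String row) {
        this.server = server;
        this.region = region;
        this.row = row;
    }
}

final class BatchGroupingSketch {
    // Groups actions by the given key, preserving arrival order within each group.
    static Map<String, List<SketchAction>> groupBy(
            List<SketchAction> actions, Function<SketchAction, String> key) {
        Map<String, List<SketchAction>> groups = new LinkedHashMap<>();
        for (SketchAction action : actions) {
            groups.computeIfAbsent(key.apply(action), k -> new ArrayList<>()).add(action);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<SketchAction> batch = Arrays.asList(
                new SketchAction("rs1", "regionA", "row1"),
                new SketchAction("rs2", "regionC", "row7"),
                new SketchAction("rs1", "regionB", "row4"),
                new SketchAction("rs1", "regionA", "row2"));

        // Client side: one group (and so one RPC) per RegionServer.
        Map<String, List<SketchAction>> perServer = groupBy(batch, a -> a.server);
        for (Map.Entry<String, List<SketchAction>> rpc : perServer.entrySet()) {
            System.out.println("RPC to " + rpc.getKey() + ": " + rpc.getValue().size() + " actions");

            // Server side: the batch is split per Region and processed one by one.
            Map<String, List<SketchAction>> perRegion = groupBy(rpc.getValue(), a -> a.region);
            for (Map.Entry<String, List<SketchAction>> r : perRegion.entrySet()) {
                System.out.println("  " + r.getKey() + ": " + r.getValue().size() + " actions");
            }
        }
    }
}
```

Because coprocessor hooks fire while each Region processes its share one action at a time, the hooks never see the batch boundary, which is the limitation raised in the quoted message.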

If you want to take some action from a CP (e.g., process a bunch of rows of
another table), you can get an HTable instance from the coprocessor's
Environment instance and use the same batching mechanism the client side uses.
Will that help in your use case?
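
A rough sketch of that suggestion, in plain Java so it runs without an HBase cluster. SketchTable, SketchEnvironment, ForwardingSketch, and RecordingTable are hypothetical stand-ins, not HBase classes; in a real coprocessor the table handle would come from the coprocessor environment and the rows would be Put objects, so check the API of your HBase version before copying the shape.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a table handle; rows are plain strings here.
interface SketchTable {
    void put(List<String> rows); // one batched call carrying many rows
}

// Hypothetical stand-in for the coprocessor environment.
interface SketchEnvironment {
    SketchTable getTable(String tableName);
}

// Forwards rows to another table using a handle obtained from the environment.
final class ForwardingSketch {
    private final SketchEnvironment env;

    ForwardingSketch(SketchEnvironment env) {
        this.env = env;
    }

    // Sends all rows in a single batched call instead of one call per row.
    void forward(String targetTable, List<String> rows) {
        if (rows.isEmpty()) {
            return; // nothing to send, skip the round trip
        }
        SketchTable table = env.getTable(targetTable);
        table.put(new ArrayList<>(rows)); // defensive copy of the caller's list
    }
}

// In-memory table that records each batched call, for demonstration only.
final class RecordingTable implements SketchTable {
    final List<List<String>> calls = new ArrayList<>();

    @Override
    public void put(List<String> rows) {
        calls.add(rows);
    }
}

final class ForwardingDemo {
    public static void main(String[] args) {
        RecordingTable index = new RecordingTable();
        ForwardingSketch forwarder = new ForwardingSketch(name -> index);
        forwarder.forward("index_table", Arrays.asList("row1", "row2", "row3"));
        System.out.println("batched calls: " + index.calls.size());
    }
}
```

The point of the sketch is only that the coprocessor hands a whole list to the table handle, so the grouping per RegionServer happens the same way it does for an ordinary client.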

On Wed, Aug 10, 2011 at 11:46 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> Here's another coprocessor question...
> From the client we batch operations in order to reduce the number of round
> trips.
> Currently there is no way (that I can find) to make use of those batches in
> coprocessors.
> This is an issue when, for example, sets of puts and gets are (partially)
> forwarded to another table by the coprocessor.
> Right now this would need to use many single puts/deletes/gets from the
> various {pre|post}{put|delete|get} hooks.
> There is no useful demarcation, other than maybe waiting a few milliseconds,
> which is awkward.
> Of course this forwarding could be done directly from the client, but then
> what's the point of coprocessors?
> I guess there could either be a {pre|post}Multi on RegionObserver (although
> HRegionServer.multi does a lot of munging).
> Or maybe a general {pre|post}Request with no arguments - in which case it
> would be at least possible to write code in the coprocessor
> to collect the puts/deletes/etc through the normal single
> prePut/preDelete/etc hooks and then batch-process them in postRequest().
> -- Lars
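
The collect-then-flush idea in the quoted proposal could look roughly like the sketch below. It is plain Java, not HBase code: preRequest and postRequest are the hooks proposed in the message and are assumed here, not existing RegionObserver methods, and RequestBatchingObserverSketch and its sink are hypothetical. The sketch also assumes a request is served start to finish on one handler thread, which is why the buffer is thread-local.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical observer; preRequest/postRequest are proposed hooks, not HBase API.
final class RequestBatchingObserverSketch {
    // One buffer per handler thread (assumes one thread serves a whole request).
    private final ThreadLocal<List<String>> pending = ThreadLocal.withInitial(ArrayList::new);

    // Receives each flushed batch, e.g. a batched write to another table.
    private final Consumer<List<String>> batchSink;

    RequestBatchingObserverSketch(Consumer<List<String>> batchSink) {
        this.batchSink = batchSink;
    }

    // Proposed hook: called once before the request's actions are processed.
    void preRequest() {
        pending.get().clear(); // drop leftovers from a request that failed midway
    }

    // Existing-style single hook: only collects, does not forward yet.
    void prePut(String row) {
        pending.get().add(row);
    }

    // Proposed hook: called once after all actions; forwards everything at once.
    void postRequest() {
        List<String> buffered = pending.get();
        if (!buffered.isEmpty()) {
            batchSink.accept(new ArrayList<>(buffered)); // hand over a copy
            buffered.clear();
        }
    }
}

final class RequestBatchingDemo {
    public static void main(String[] args) {
        List<List<String>> flushed = new ArrayList<>();
        RequestBatchingObserverSketch observer = new RequestBatchingObserverSketch(flushed::add);

        observer.preRequest();
        observer.prePut("row1");
        observer.prePut("row2");
        observer.postRequest();

        System.out.println("batches forwarded: " + flushed.size());
    }
}
```

With hooks like these, the request boundary itself is the demarcation, so no timer or millisecond wait is needed to decide when to flush.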