I have a custom co-processor endpoint that handles aggregation of
various statistics for each region (the stats from all regions are
then merged together for the final result). Sometimes a region holds
so much data that aggregating it takes longer than the exec timeout.
When that happens, the client compounds the problem by retrying the
call, up to 10 times.
I haven't been able to find any supported APIs for getting around
this, so I intend to modify my co-processor to stop itself after N
seconds and include in its result the row key where it should resume.
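
As a rough sketch of that stop-and-resume idea (plain Java, not HBase code): a sorted map stands in for a region's rows, and every name here (ResumableAggregation, PartialResult, aggregate, the injected clock) is made up for illustration. It sums values until a time budget runs out, then returns the partial sum together with the first unprocessed key.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.LongSupplier;

// Hypothetical sketch: a sorted map stands in for the rows of one region.
class ResumableAggregation {

    /** Partial result of one bounded call; resumeKey is null once the region is done. */
    static final class PartialResult {
        final long sum;
        final long rowsScanned;
        final String resumeKey;

        PartialResult(long sum, long rowsScanned, String resumeKey) {
            this.sum = sum;
            this.rowsScanned = rowsScanned;
            this.resumeKey = resumeKey;
        }

        boolean isComplete() {
            return resumeKey == null;
        }
    }

    /**
     * Sums values starting at startKey (inclusive; null means the first row) until
     * budgetNanos has elapsed on the supplied clock. At least one row is processed
     * per call, so repeated calls always make progress.
     */
    static PartialResult aggregate(NavigableMap<String, Long> rows, String startKey,
                                   long budgetNanos, LongSupplier clock) {
        long deadline = clock.getAsLong() + budgetNanos;
        long sum = 0;
        long count = 0;
        NavigableMap<String, Long> remaining =
                (startKey == null) ? rows : rows.tailMap(startKey, true);
        for (Map.Entry<String, Long> e : remaining.entrySet()) {
            // Check the budget before consuming the row, so the row we stop at
            // is exactly the one the next call must start from.
            if (count > 0 && clock.getAsLong() - deadline >= 0) {
                return new PartialResult(sum, count, e.getKey());
            }
            sum += e.getValue();
            count++;
        }
        return new PartialResult(sum, count, null);
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> rows = new TreeMap<>();
        for (int i = 0; i < 100000; i++) {
            rows.put(String.format("row%06d", i), (long) i);
        }
        long total = 0;
        int calls = 0;
        String resume = null;
        do {
            // 1 ms budget per call, measured with the real monotonic clock.
            PartialResult p = aggregate(rows, resume, 1_000_000L, System::nanoTime);
            total += p.sum;
            resume = p.resumeKey;
            calls++;
        } while (resume != null);
        System.out.println("total=" + total + " calls=" + calls);
    }
}
```

The caller feeds resumeKey back in as startKey until it comes back null. The clock is passed in only so the cutoff can be tested deterministically.
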
I can repeatedly invoke HTable.coprocessorExec until all of the
regions report that they've finished their aggregations, but each
subsequent call to HTable.coprocessorExec will hit all regions, even
those that have already completed their work.
The only way I can see to efficiently invoke my co-processor on only
the servers with work remaining is to write my own code to manage the
co-processor proxy objects. I haven't found any documentation that
details the thread-safety of each proxy instance, or information about
which thread pool is used.
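
The bookkeeping this would involve can be sketched as follows. This is an illustration under assumptions, not HBase code: RegionCall is a hypothetical stand-in for one time-bounded call against a single region's proxy, boundedFake simulates regions with in-memory maps, and the calls run sequentially so the sketch makes no claim about proxy thread-safety or thread pools.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: re-invoke only the regions that still have work remaining.
class PendingRegionDriver {

    /** Result of one bounded call; resumeKey is null when the region has finished. */
    static final class Partial {
        final long sum;
        final String resumeKey;

        Partial(long sum, String resumeKey) {
            this.sum = sum;
            this.resumeKey = resumeKey;
        }
    }

    /** Stand-in (not an HBase API) for a single-region coprocessor invocation. */
    interface RegionCall {
        Partial invoke(String regionStartKey, String resumeKey);
    }

    static long aggregateAll(List<String> regionStartKeys, RegionCall call, int maxRounds) {
        // region start key -> key to resume from (initially the region's own start key)
        Map<String, String> pending = new LinkedHashMap<>();
        for (String region : regionStartKeys) {
            pending.put(region, region);
        }
        long total = 0;
        for (int round = 0; round < maxRounds && !pending.isEmpty(); round++) {
            // Iterate over a snapshot so finished regions can be removed from the map.
            for (String region : new ArrayList<>(pending.keySet())) {
                Partial p = call.invoke(region, pending.get(region));
                total += p.sum;
                if (p.resumeKey == null) {
                    pending.remove(region);           // finished: never called again
                } else {
                    pending.put(region, p.resumeKey); // continue from here next round
                }
            }
        }
        if (!pending.isEmpty()) {
            throw new IllegalStateException("regions still pending: " + pending.keySet());
        }
        return total;
    }

    /** Fake cluster: each call sums at most rowsPerCall rows, then reports where to resume. */
    static RegionCall boundedFake(Map<String, NavigableMap<String, Long>> regions,
                                  int rowsPerCall, Map<String, Integer> callCounts) {
        return (region, resumeKey) -> {
            callCounts.merge(region, 1, Integer::sum);
            long sum = 0;
            int seen = 0;
            for (Map.Entry<String, Long> e
                    : regions.get(region).tailMap(resumeKey, true).entrySet()) {
                if (seen == rowsPerCall) {
                    return new Partial(sum, e.getKey());
                }
                sum += e.getValue();
                seen++;
            }
            return new Partial(sum, null);
        };
    }

    public static void main(String[] args) {
        Map<String, NavigableMap<String, Long>> regions = new LinkedHashMap<>();
        NavigableMap<String, Long> big = new TreeMap<>();
        for (int i = 1; i <= 5; i++) {
            big.put("a" + i, (long) i);
        }
        NavigableMap<String, Long> small = new TreeMap<>();
        small.put("m1", 10L);
        small.put("m2", 20L);
        regions.put("a", big);
        regions.put("m", small);

        Map<String, Integer> calls = new LinkedHashMap<>();
        long total = aggregateAll(new ArrayList<>(regions.keySet()),
                boundedFake(regions, 2, calls), 10);
        System.out.println("total=" + total + " callsPerRegion=" + calls);
    }
}
```

The point of the pending map is that a region drops out of the loop as soon as it reports no resume key, so later rounds only touch regions with work left. A real client would presumably dispatch each round in parallel, which is exactly where the thread-safety question matters.
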
Can anyone shed some light on this strategy? Perhaps you've
encountered the same issue. How did you solve it?
Thanks in advance!