apatro 2012-04-23, 08:34
Counter question: why do you want to run M/R jobs to do aggregation? You
could do this in situ with a custom aggregation coprocessor. Essentially,
you would set a time span over which you would aggregate a row (or possibly
multiple rows, but then you have to be sure that they are on the same
region, which means using a custom split policy or pre-splitting and
turning splitting off altogether). If you apply the coprocessor at scan,
flush, and compaction, you should get the same behavior without all the
messy IO. We
don't really have a good guide for how to do this kind of thing, but the
concept here is similar to what Accumulo does with
But to answer your original question, I don't use anything other than cron
for that kind of stuff (that's what it's there for :).
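
To make the coprocessor idea above a bit more concrete, here is a minimal sketch of just the aggregation step, in plain Java with no HBase dependencies. Everything in it (the TimeSpanAggregator class, the Cell holder, the choice of a sum) is a hypothetical stand-in, not HBase API; a real coprocessor would apply similar logic to the cells it sees in its scan, flush, and compaction hooks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TimeSpanAggregator {

    /** Hypothetical stand-in for one cell: row key, timestamp (ms), numeric value. */
    public static final class Cell {
        public final String row;
        public final long timestampMs;
        public final long value;

        public Cell(String row, long timestampMs, long value) {
            this.row = row;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /** Start of the time span that contains the given timestamp. */
    public static long bucketStart(long timestampMs, long spanMs) {
        return Math.floorDiv(timestampMs, spanMs) * spanMs;
    }

    /**
     * Collapses all cells of a row that fall into the same time span into a
     * single summed cell, stamped with the start of that span.
     */
    public static List<Cell> aggregate(List<Cell> cells, long spanMs) {
        // row -> (span start -> running sum); TreeMap keeps the output sorted.
        Map<String, Map<Long, Long>> sums = new TreeMap<>();
        for (Cell c : cells) {
            sums.computeIfAbsent(c.row, r -> new TreeMap<>())
                .merge(bucketStart(c.timestampMs, spanMs), c.value, Long::sum);
        }
        List<Cell> out = new ArrayList<>();
        for (Map.Entry<String, Map<Long, Long>> row : sums.entrySet()) {
            for (Map.Entry<Long, Long> span : row.getValue().entrySet()) {
                out.add(new Cell(row.getKey(), span.getKey(), span.getValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long hourMs = 3_600_000L; // assumed aggregation span: one hour
        List<Cell> raw = Arrays.asList(
            new Cell("r1", 0L, 2L),
            new Cell("r1", 1_000L, 3L),
            new Cell("r1", hourMs, 7L));
        for (Cell c : aggregate(raw, hourMs)) {
            System.out.println(c.row + " @" + c.timestampMs + " = " + c.value);
        }
    }
}
```

Because a sum is associative and the aggregated cell is stamped with the start of its span, running aggregate on already-aggregated output leaves it unchanged. That property is what would let the same logic run at scan, flush, and compaction and still give consistent results.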
On Mon, Apr 23, 2012 at 1:34 AM, apatro <[EMAIL PROTECTED]> wrote:
> I'd like to know if there is some alternative to using crons while
> scheduling Map Reduce jobs wherein one can incorporate one's own scheduling
> logic. For instance, to perform aggregation on table data at a particular
> hour of the day or on a particular day of the week, and the like.
> Thanks in advance :)
> Arati Patro
> Sent from the HBase - Developer mailing list archive at Nabble.com.