You understood me perfectly well. I see the point of your first
suggestion, but I am not allowed to have gaps. A central service is
something I may consider if the single reducer becomes a worse
bottleneck than the service would be.
But what are counters for? They seem to be exactly that.
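
For reference, the task-id scheme from the quoted reply can be sketched
roughly as below. This is a minimal illustration in plain Java, not
Hadoop code: the class name TaskIdNumbering, the method globalId, and
the bound MAX_RECORDS_PER_REDUCER are all hypothetical names made up
for the example, and in a real job the task id would come from the
framework rather than being passed in by hand.

```java
public class TaskIdNumbering {
    // Assumed upper bound on the records one reducer may emit (hypothetical value).
    public static final long MAX_RECORDS_PER_REDUCER = 1_000_000L;

    // Each task owns the range [taskId * MAX, (taskId + 1) * MAX) and
    // counts up inside it, so no coordination between tasks is needed.
    public static long globalId(int taskId, long localIndex) {
        if (taskId < 0) {
            throw new IllegalArgumentException("negative task id: " + taskId);
        }
        if (localIndex < 0 || localIndex >= MAX_RECORDS_PER_REDUCER) {
            throw new IllegalArgumentException(
                "local index outside the reserved range: " + localIndex);
        }
        return taskId * MAX_RECORDS_PER_REDUCER + localIndex;
    }

    public static void main(String[] args) {
        // Pretend reducer 0 emits three records and reducer 1 emits two.
        for (long i = 0; i < 3; i++) {
            System.out.println("reducer 0 -> " + globalId(0, i));
        }
        for (long i = 0; i < 2; i++) {
            System.out.println("reducer 1 -> " + globalId(1, i));
        }
    }
}
```

With these made-up numbers, reducer 0 ends at 2 and reducer 1 starts at
1000000, so the ids are unique but not consecutive. That is exactly the
gap I cannot have.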
On Fri, May 20, 2011 at 12:01 PM, Joey Echeverria <[EMAIL PROTECTED]> wrote:
> To make sure I understand you correctly, you need a globally unique
> one up counter for each output record?
> If you had an upper bound on the number of records a single reducer
> could output and you can afford to have gaps, you could just use the
> task id and multiply that by the max number of records and then one up
> from there.
> If that doesn't work for you, then you'll need to use some kind of
> central service for allocating numbers which could become a bottleneck.
> On Fri, May 20, 2011 at 9:55 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> > Hi, can I use a Counter to give each record in all reducers a consecutive
> > number? Currently I am using a single Reducer, but it is an anti-pattern.
> > But I need to assign consecutive numbers to all output records in all
> > reducers, and it does not matter how, as long as each gets its own number.
> > If it IS possible, then how are multiple processes accessing those
> > without creating race conditions?
> > Thank you,
> > Mark
> Joseph Echeverria
> Cloudera, Inc.