I agree that the behaviour shouldn't change dynamically at runtime
depending on whether the class is being used as a Combiner or a Reducer.
But someone may still want to produce counters to get an overview of what
is happening (a sanity check), and in that case you really would not want
the Combiner and the Reducer to aggregate into the same counters. How would
someone do that? You could put a combine/reduce keyword in the counter
names, but how would you detect which instantiation is being used in which
case? I guess it might be possible with the task name somehow. Is there a
better way?
However, if you look at the jobtracker counters summary, there is already a
distinction between map and reduce values. Maybe that is enough in this
case? (I have never used counters inside a combiner, so I don't know.)
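
For what it's worth, here is a rough sketch of one way to keep the counters
apart without duplicating any logic, along the lines of the "two thin
implementations" suggestion quoted below. It is plain Java, not a tested
Hadoop job: CounterSink is a hypothetical stand-in for Hadoop's counter API
(in a real job the increment would go through something like
Reporter.incrCounter(group, counter, amount)), and the names SumLogic,
SumCombiner, SumReducer and VALUES_SEEN are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Hadoop's counter API; not a real Hadoop type.
interface CounterSink {
    void increment(String group, String name, long amount);
}

// Shared logic lives here once; subclasses only choose the counter group.
abstract class SumLogic {
    protected abstract String counterGroup();

    long sum(List<Long> values, CounterSink counters) {
        long total = 0;
        for (long v : values) {
            total += v;
            // Counted under "combine" or "reduce", depending on the subclass.
            counters.increment(counterGroup(), "VALUES_SEEN", 1);
        }
        return total;
    }
}

// The class that would be registered as the combiner.
class SumCombiner extends SumLogic {
    @Override
    protected String counterGroup() { return "combine"; }
}

// The class that would be registered as the reducer.
class SumReducer extends SumLogic {
    @Override
    protected String counterGroup() { return "reduce"; }
}

// In-memory sink, only here to make the sketch runnable outside Hadoop.
class MapCounterSink implements CounterSink {
    final Map<String, Long> counts = new HashMap<>();

    @Override
    public void increment(String group, String name, long amount) {
        counts.merge(group + "." + name, amount, Long::sum);
    }
}
```

The point is that nothing has to detect at runtime which role it is playing:
the job would register SumCombiner as the combiner and SumReducer as the
reducer, and each one reports under its own counter group.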
On Tue, Nov 6, 2012 at 12:29 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Hi Prasad,
> My reply inline.
> On Tue, Nov 6, 2012 at 4:15 PM, Prasad GS <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I'm setting my combiner and reducer to the same java class. Is there
> > anything that could tell me the context in which the java class is
> > running after the hadoop job is submitted to the cluster, i.e. whether
> > the class is running as a combiner or a reducer.
> A combiner may run both at the map end and at the reduce end. Even if
> it is possible to do it, it isn't a healthy idea to have the method's
> logic detect if it's running as a reducer or as a combiner.
> > I need this information to change the OutputCollector
> > in the java class. Also I do not want to duplicate the same code as
> > combiner and reducer with only the OutputCollector changed.
> Why do you think it would require duplication? Your logic can be built
> in smaller, independent, reusable functions within the same class, and
> just applied differently for an implementation of Reducer class and an
> implementation of the Combiner class. This way, you repeat nothing.
> > Thanks,
> > Prasad
> Harsh J