Hadoop >> mail # user >> Grouping in Combiners


Re: Grouping in Combiners
Not one I can think of at the moment, but intuitively that's the kind of
flexibility I would like to see should a grouping comparator become
configurable for Combiners.

Pushing your reasoning further, why specify a combiner class at all
instead of applying the Reducer on the map side?
On Oct 31, 2011 10:02 PM, "Shevek" <[EMAIL PROTECTED]> wrote:

> On 31 October 2011 13:37, Mathias Herberts <[EMAIL PROTECTED]> wrote:
>
> > I don't know if it's a bug but I'd rather have the ability to set a
> > Combiner specific group comparator than to have the Combiner use the
> > group comparator set for the Reducer.
> > On Oct 31, 2011 9:21 PM, "Harsh J" <[EMAIL PROTECTED]> wrote:
> >
>
> Now I'm curious. Can you argue that there's a case where it makes a
> difference? Preferably one where it can't be trivially curried into the
> combiner?
>
> S.
>
>
> > > Shevek,
> > >
> > > The problem Mathias indicates here is that the Combiners do not utilize
> > > > the Grouping Comparators. They only use the Sort Comparators. What I
> > > > wonder is whether that is a bug.
> > >
> > > On 31-Oct-2011, at 11:14 PM, Shevek wrote:
> > >
> > > > > I like the ability to reuse a Java component for both sorting and
> > > > > grouping, and to be honest, since the cases where one can do a
> > > > > comparison without deserializing the raw bytes are relatively few and
> > > > > far between, I tend to use java's Comparator interface, and wrap it in
> > > > > some infrastructure-specific adapter. I have a vague feeling that Hadoop
> > > > > sometimes calls the byte interface and sometimes the object interface
> > > > > anyway? ICBW, the way I've been writing code makes it irrelevant.
> > > > >
> > > > > Alternatively, I've misunderstood the (simpler) question, and the
> > > > > answer is to use the setGroupingComparatorClass() API.
> > > >
> > > > S.
> > > >
> > > > > On 29 October 2011 04:35, Mathias Herberts <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> Another point concerning the Combiners,
> > > >>
> > > >> the grouping is currently done using the RawComparator used for
> > > > >> sorting the Mapper's output. Wouldn't it be useful to be able to
> > > > >> set a custom CombinerGroupingComparatorClass?
> > > >>
> > > >> Mathias.
> > > >>
> > >
> > >
> >
>
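
To make the distinction discussed in this thread concrete, here is a minimal
sketch in plain Java (no Hadoop dependency) of what changes when a run of
sorted keys is grouped with the sort comparator versus a coarser grouping
comparator. The class GroupingSketch, the composite Key type, and the sample
data are all hypothetical and exist only for illustration; this is not how
Hadoop implements grouping internally, just the same idea in miniature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupingSketch {

    /** Hypothetical composite key: a natural key plus a secondary sort field. */
    static final class Key {
        final String natural;
        final int secondary;

        Key(String natural, int secondary) {
            this.natural = natural;
            this.secondary = secondary;
        }
    }

    /** Sort comparator: orders by natural key, then by the secondary field. */
    static final Comparator<Key> SORT =
            Comparator.comparing((Key k) -> k.natural)
                      .thenComparingInt(k -> k.secondary);

    /** Grouping comparator: looks at the natural key only. */
    static final Comparator<Key> GROUP =
            Comparator.comparing((Key k) -> k.natural);

    /** Splits an already sorted list into runs of keys that cmp treats as equal. */
    static List<List<Key>> group(List<Key> sorted, Comparator<Key> cmp) {
        List<List<Key>> groups = new ArrayList<>();
        Key previous = null;
        for (Key k : sorted) {
            // Start a new group whenever the comparator sees a different key.
            if (previous == null || cmp.compare(previous, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            previous = k;
        }
        return groups;
    }

    /** Made-up sample data, returned as a mutable list so it can be sorted. */
    static List<Key> sample() {
        return new ArrayList<>(Arrays.asList(
                new Key("a", 2), new Key("b", 1), new Key("a", 1),
                new Key("b", 3), new Key("a", 3)));
    }

    public static void main(String[] args) {
        List<Key> keys = sample();
        keys.sort(SORT);
        // Grouping with the sort comparator: every distinct composite key is its own group.
        System.out.println("groups by sort comparator: " + group(keys, SORT).size());
        // Grouping with the coarser comparator: one group per natural key.
        System.out.println("groups by grouping comparator: " + group(keys, GROUP).size());
    }
}
```

With this sample data the sort comparator yields five single-key groups, while
the grouping comparator yields two groups, each seeing its values in secondary
order. A combiner that groups only by the sort comparator, as described above,
would be in the first situation.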