Hadoop >> mail # user >> Grouping in Combiners


Re: Grouping in Combiners
Not one I can think of at the moment, but intuitively that's the kind of
flexibility I would like to see should a grouping comparator become
configurable for Combiners.

Pushing your reasoning further, why specify a combiner class at all
instead of simply applying the Reducer on the map side?
On Oct 31, 2011 10:02 PM, "Shevek" <[EMAIL PROTECTED]> wrote:

> On 31 October 2011 13:37, Mathias Herberts <[EMAIL PROTECTED]> wrote:
>
> > I don't know if it's a bug but I'd rather have the ability to set a
> > Combiner specific group comparator than to have the Combiner use the
> > group comparator set for the Reducer.
> > On Oct 31, 2011 9:21 PM, "Harsh J" <[EMAIL PROTECTED]> wrote:
> >
>
> Now I'm curious. Can you argue that there's a case where it makes a
> difference? Preferably one where it can't be trivially curried into the
> combiner?
>
> S.
>
>
> > > Shevek,
> > >
> > > The problem Mathias indicates here is that the Combiners do not utilize
> > > the Grouping Comparators. They only use the Sort Comparators. I wonder
> > > whether that is a bug.
> > >
> > > On 31-Oct-2011, at 11:14 PM, Shevek wrote:
> > >
> > > > I like the ability to reuse a Java component for both sorting and
> > > > grouping, and to be honest, since the cases where one can do a
> > > > comparison without deserializing the raw bytes are relatively few and
> > > > far between, I tend to use Java's Comparator interface, and wrap it in
> > > > some infrastructure-specific adapter. I have a vague feeling that
> > > > Hadoop sometimes calls the byte interface and sometimes the object
> > > > interface anyway? ICBW, the way I've been writing code makes it
> > > > irrelevant.
> > > >
> > > > Alternatively, I've misunderstood the (simpler) question, and the
> > > > answer is to use the setGroupingComparatorClass() API.
> > > >
> > > > S.
> > > >
> > > > On 29 October 2011 04:35, Mathias Herberts <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> Another point concerning the Combiners,
> > > >>
> > > >> the grouping is currently done using the RawComparator used for
> > > >> sorting the Mapper's output. Wouldn't it be useful to be able to
> > > >> set a custom CombinerGroupingComparatorClass?
> > > >>
> > > >> Mathias.
> > > >>
> > >
> > >
> >
>
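
To make the distinction discussed in this thread concrete, here is a minimal
plain-Java sketch of how the choice of comparator changes the groups a
combine or reduce step would see. It does not use or reproduce Hadoop's
classes: the Key type, the SORT and GROUP comparators, and the group() helper
are all hypothetical names invented for illustration, and the loop only
mimics the general idea of "start a new group when the comparator reports a
different key" over already-sorted records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CombinerGroupingSketch {

    /** Hypothetical composite key: a natural key plus a secondary sort field. */
    static final class Key {
        final String natural;
        final int secondary;

        Key(String natural, int secondary) {
            this.natural = natural;
            this.secondary = secondary;
        }
    }

    // Stand-in for a sort comparator: natural key first, then the secondary field.
    static final Comparator<Key> SORT =
        Comparator.comparing((Key k) -> k.natural).thenComparingInt(k -> k.secondary);

    // Stand-in for a grouping comparator: natural key only.
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.natural);

    /**
     * Walks records that are already sorted with SORT and starts a new group
     * whenever the supplied comparator says the key has changed.
     */
    static List<List<Key>> group(List<Key> sorted, Comparator<Key> grouper) {
        List<List<Key>> groups = new ArrayList<>();
        Key previous = null;
        for (Key k : sorted) {
            if (previous == null || grouper.compare(previous, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            previous = k;
        }
        return groups;
    }

    /** Five sample records, sorted the way the map output would be sorted. */
    static List<Key> sampleSorted() {
        List<Key> keys = new ArrayList<>(Arrays.asList(
            new Key("b", 2), new Key("a", 3), new Key("a", 1),
            new Key("b", 1), new Key("a", 1)));
        keys.sort(SORT);
        return keys;
    }

    public static void main(String[] args) {
        List<Key> sorted = sampleSorted();
        // Grouping with the sort comparator: one group per distinct composite key (4 here).
        System.out.println("groups with sort comparator:     " + group(sorted, SORT).size());
        // Grouping with the natural-key comparator: one group per natural key (2 here).
        System.out.println("groups with grouping comparator: " + group(sorted, GROUP).size());
    }
}
```

With the sort comparator doing the grouping, records sharing a natural key
are still split by the secondary field, so a combiner grouped that way would
see four small groups where a reducer configured through
setGroupingComparatorClass() would see two.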