Hadoop >> mail # user >> Grouping in Combiners


+
Mathias Herberts 2011-10-29, 11:35
+
Shevek 2011-10-31, 17:44
+
Harsh J 2011-10-31, 20:20
+
Shevek 2011-10-31, 20:37
-
Re: Grouping in Combiners
I don't know if it's a bug, but I'd rather have the ability to set a
Combiner-specific grouping comparator than have the Combiner use the
grouping comparator set for the Reducer.
On Oct 31, 2011 9:21 PM, "Harsh J" <[EMAIL PROTECTED]> wrote:

> Shevek,
>
> The problem Mathias describes here is that Combiners do not use the
> Grouping Comparator. They only use the Sort Comparator. I wonder
> whether that is a bug.
>
> On 31-Oct-2011, at 11:14 PM, Shevek wrote:
>
> > I like the ability to reuse a Java component for both sorting and
> > grouping, and to be honest, since the cases where one can do a
> > comparison without deserializing the raw bytes are relatively few and
> > far between, I tend to use Java's Comparator interface and wrap it in
> > some infrastructure-specific adapter. I have a vague feeling that
> > Hadoop sometimes calls the byte interface and sometimes the object
> > interface anyway? I could be wrong, but the way I've been writing code
> > makes it irrelevant.
> >
> > Alternatively, I've misunderstood the (simpler) question, and the
> > answer is to use the setGroupingComparatorClass() API.
> >
> > S.
> >
> > On 29 October 2011 04:35, Mathias Herberts <[EMAIL PROTECTED]> wrote:
> >
> >> Another point concerning the Combiners,
> >>
> >> the grouping is currently done using the RawComparator used for
> >> sorting the Mapper's output. Wouldn't it be useful to be able to set a
> >> custom CombinerGroupingComparatorClass?
> >>
> >> Mathias.
> >>
>
>
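
To make the distinction discussed in this thread concrete, here is a minimal sketch in plain Java. It uses no Hadoop classes, and `Key`, `SORT`, `GROUP`, and `group` are hypothetical names invented for illustration. It only simulates how many groups a Combiner or Reducer would see when a sorted run of keys is split with the sort comparator versus a coarser grouping comparator. It is not how Hadoop implements grouping internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class CombinerGroupingSketch {

    /** Hypothetical composite key: a natural key plus a secondary-sort field. */
    public static final class Key {
        final String name;
        final int seq;

        public Key(String name, int seq) {
            this.name = name;
            this.seq = seq;
        }
    }

    /** Stand-in for the sort comparator: orders by name, then by seq. */
    public static final Comparator<Key> SORT =
            Comparator.comparing((Key k) -> k.name).thenComparingInt(k -> k.seq);

    /** Stand-in for a grouping comparator: keys with the same name compare equal. */
    public static final Comparator<Key> GROUP =
            Comparator.comparing((Key k) -> k.name);

    /** Splits an already sorted list into runs of keys that cmp considers equal. */
    public static List<List<Key>> group(List<Key> sorted, Comparator<Key> cmp) {
        List<List<Key>> groups = new ArrayList<>();
        Key prev = null;
        for (Key k : sorted) {
            // Start a new group whenever the comparator sees a different key.
            if (prev == null || cmp.compare(prev, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            prev = k;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> keys = new ArrayList<>();
        keys.add(new Key("a", 2));
        keys.add(new Key("b", 1));
        keys.add(new Key("a", 1));
        keys.sort(SORT);

        // Grouping with the sort comparator: each distinct (name, seq) is its own group.
        System.out.println("groups by sort comparator: " + group(keys, SORT).size());
        // Grouping with the grouping comparator: one group per name.
        System.out.println("groups by grouping comparator: " + group(keys, GROUP).size());
    }
}
```

With the three sample keys, the sort comparator yields three groups and the grouping comparator yields two. That gap is what a Combiner-specific grouping comparator would close: a Combiner that groups only by the sort comparator never sees ("a", 1) and ("a", 2) together.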
+
Shevek 2011-10-31, 21:02
+
Mathias Herberts 2011-10-31, 21:13
+
Shevek 2011-10-31, 22:15