MapReduce >> mail # user >> Partitioner vs GroupComparator


Eugene Morozov 2013-08-23, 15:19
Re: Partitioner vs GroupComparator
The partitioner runs on the map side. It assigns a partition ID
(reducer ID) to each key.
The grouping comparator runs on the reduce side. It tells each reducer,
which reads a single merge-sorted stream of records, how to break
that stream into reduce calls of <key, values[]>.
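
To make the distinction concrete, here is a minimal sketch in plain Java. It
does not use the Hadoop API; it only simulates the two steps. The class name,
the "common:rest" key format, and the helper methods are assumptions made up
for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PartitionVsGroupSketch {

    // Hypothetical key format "common:rest"; returns the part before ':'.
    static String commonPart(String key) {
        return key.split(":", 2)[0];
    }

    // Map side: choose a reducer for a key (hash-mod scheme, hashing only
    // the common part so that related keys land on the same reducer).
    static int partition(String key, int numReducers) {
        return (commonPart(key).hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Grouping comparator that treats keys with the same common part as equal.
    static final Comparator<String> BY_COMMON =
            Comparator.comparing(PartitionVsGroupSketch::commonPart);

    // Reduce side: walk an already-sorted key stream and start a new group
    // (one simulated reduce call) whenever the comparator reports a change
    // between consecutive keys.
    static List<List<String>> group(List<String> sortedKeys, Comparator<String> grouping) {
        List<List<String>> calls = new ArrayList<>();
        String previous = null;
        for (String key : sortedKeys) {
            if (previous == null || grouping.compare(previous, key) != 0) {
                calls.add(new ArrayList<>());
            }
            calls.get(calls.size() - 1).add(key);
            previous = key;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("user1:a", "user1:b", "user2:a");
        for (String key : sorted) {
            System.out.println(key + " -> reducer " + partition(key, 4));
        }
        // Grouping by full key: three reduce calls.
        System.out.println(group(sorted, Comparator.naturalOrder()));
        // Grouping by common part: two reduce calls.
        System.out.println(group(sorted, BY_COMMON));
    }
}
```

In this sketch the partitioner only decides which reducer sees a key, while
the grouping comparator decides where one reduce call ends and the next one
begins inside that reducer.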

Typically one never overrides the GroupingComparator; by default it is
the same as the SortComparator. But if you wish to do things
such as Secondary Sort, then overriding it becomes useful, because you
may want to sort over two parts of a key object but group by only one
part.
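
A rough illustration of that secondary-sort idea, again as a plain-Java
simulation rather than real Hadoop code. The composite Key class with its id
and timestamp fields is hypothetical; SORT stands in for the SortComparator
and GROUP for the GroupingComparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SecondarySortSketch {

    // Hypothetical composite key: group by 'id', order within a group by 'timestamp'.
    static final class Key {
        final String id;
        final long timestamp;

        Key(String id, long timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return id + "@" + timestamp;
        }
    }

    // Sort over both parts of the key.
    static final Comparator<Key> SORT =
            Comparator.comparing((Key k) -> k.id).thenComparingLong(k -> k.timestamp);

    // Group by the first part only.
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.id);

    // Sort the map output with SORT, then cut it into simulated reduce calls
    // wherever GROUP sees a change between consecutive keys.
    static List<List<Key>> reduceCalls(List<Key> mapOutput) {
        List<Key> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT);
        List<List<Key>> calls = new ArrayList<>();
        Key previous = null;
        for (Key key : sorted) {
            if (previous == null || GROUP.compare(previous, key) != 0) {
                calls.add(new ArrayList<>());
            }
            calls.get(calls.size() - 1).add(key);
            previous = key;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<Key> mapOutput = Arrays.asList(
                new Key("b", 2), new Key("a", 3), new Key("a", 1), new Key("b", 1));
        System.out.println(reduceCalls(mapOutput)); // [[a@1, a@3], [b@1, b@2]]
    }
}
```

Each inner list corresponds to one reduce call: the records are grouped by id
alone, yet arrive already ordered by timestamp within the group.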

On Fri, Aug 23, 2013 at 8:49 PM, Eugene Morozov
<[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have two different types of keys emitted from Map and processed by Reduce.
> These keys have some part in common, and I'd like to have similar keys in
> one reducer. For that purpose I used a Partitioner that partitions everything
> that comes in by this common part. It seems to be fine, but MRUnit doesn't seem
> to know anything about Partitioners. So, here is where GroupComparator comes
> into play. It seems that MRUnit is well aware of it, but it surprises me:
> it looks like Partitioner and GroupComparator are actually doing exactly
> the same thing: they both somehow group keys to have them in one reducer.
> Could you shed some light on it, please?
> --
>

--
Harsh J
Lukavsky, Jan 2013-08-23, 19:13
java8964 java8964 2013-08-23, 17:45