

Mathias Herberts 2011-10-31, 13:14
Prashant Sharma 2011-11-01, 05:39
JAX 2011-11-01, 10:54
Rob Marano 2011-10-31, 13:20
Mathijs Homminga 2011-10-31, 15:18

Re: Hadoop MapReduce Poster
On Mon, Oct 31, 2011 at 6:14 AM, Mathias Herberts <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm in the process of putting together a 'Hadoop MapReduce Poster' so
> my students can better understand the various steps of a MapReduce job
> as run by Hadoop.

Most of it is probably beneath the radar, but if you want the details of
how the sort actually works in MapReduce, I'd suggest going through Chris
Douglas' presentation on it.

http://www.slideshare.net/hadoopusergroup/ordered-record-collection?from=ss_embed

At the very least, you want to show the serialization before the sort in
the Mapper and deserialization in the Reducer, which gives you a good
platform to talk about why you need to define RawComparators for your key
types if you want reasonable performance out of the sort.
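
To make the RawComparator point concrete, here is a minimal sketch of comparing keys directly on their serialized bytes. It assumes a key that serializes as a 4-byte big-endian int (the layout DataOutput.writeInt produces). RawBytesComparator and RawIntKeySort are hypothetical names: the interface only mirrors the method shape of Hadoop's org.apache.hadoop.io.RawComparator so that the example compiles without Hadoop on the classpath. It is not the code MapReduce itself runs.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-in with the same method shape as Hadoop's
// org.apache.hadoop.io.RawComparator (assumption: used here only so the
// sketch is self-contained).
interface RawBytesComparator {
    int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
}

class RawIntKeySort {
    // Serialize an int key as 4 big-endian bytes, the same layout that
    // DataOutput.writeInt produces.
    static byte[] serialize(int key) {
        return ByteBuffer.allocate(4).putInt(key).array();
    }

    // Decode a big-endian int straight from the buffer, without creating
    // a key object.
    static int readInt(byte[] b, int start) {
        return ((b[start] & 0xff) << 24)
             | ((b[start + 1] & 0xff) << 16)
             | ((b[start + 2] & 0xff) << 8)
             |  (b[start + 3] & 0xff);
    }

    // Compare two serialized keys in place. The lengths are unused because
    // every key in this sketch is exactly 4 bytes long.
    static final RawBytesComparator RAW =
        (b1, s1, l1, b2, s2, l2) ->
            Integer.compare(readInt(b1, s1), readInt(b2, s2));

    // Sort records while they stay serialized, as the map-side sort does
    // conceptually: no key is deserialized into an object for a comparison.
    static byte[][] sortSerialized(int... keys) {
        byte[][] records = new byte[keys.length][];
        for (int i = 0; i < keys.length; i++) {
            records[i] = serialize(keys[i]);
        }
        Arrays.sort(records,
            (a, b) -> RAW.compare(a, 0, a.length, b, 0, b.length));
        return records;
    }

    public static void main(String[] args) {
        for (byte[] record : sortSerialized(42, -7, 1000, 0)) {
            System.out.println(readInt(record, 0));
        }
    }
}
```

Without a raw comparator, each comparison would first have to deserialize both keys into objects, and the sort performs a very large number of comparisons, which is where the performance difference comes from.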

 -- Owen