MapReduce >> mail # dev >> Question about intermediate kv pair files


rshepherd 2012-12-03, 18:08
Mostafa Elhemali 2012-12-03, 18:26
Re: Question about intermediate kv pair files
Thanks Mostafa! Very much appreciated.

On 12/3/12 1:26 PM, Mostafa Elhemali wrote:
> (Disclaimer: Not an expert, but looked at that code quite a bit. Hopefully
> the list will correct any details I get wrong)
>
> In Hadoop 1: each mapper puts its output file in a well-known location on
> the machine (encoded by user, job ID and map ID), then the TaskTracker
> serves it over HTTP to the reducer when it requests it (authenticated using
> a secret token in the job). Look in the MapOutputServlet class in
> TaskTracker for most of the related code.
>
> In YARN: similar thing, except that now it's a NodeManager plug-in
> (auxiliary service) that serves the map output, since there's no
> TaskTracker anymore. Look at the ShuffleHandler class in the
> hadoop-mapreduce-client-shuffle project. I see comments in the code
> indicating that this will be changed from a NodeManager plug-in in the
> future, but I don't know much about that.
>
> Hope it helps,
> Mostafa
>
>
> On Mon, Dec 3, 2012 at 10:08 AM, rshepherd <[EMAIL PROTECTED]> wrote:
>
>> Hi folks,
>>
>> Can anyone explain to me briefly how each mapper reports the
>> location of the intermediate kv partition files to the master? And, if
>> possible, where in the code I might find where that happens?
>>
>> Thanks for any help,
>> Randy
>>
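
For readers following the thread, here is a rough sketch of the fetch request described in Mostafa's reply: the reducer asks the serving node for one map's output over HTTP. The class and method names are hypothetical and not part of Hadoop. The /mapOutput path and the job, map and reduce query parameters are how I recall the request looking in the Hadoop 1 ReduceTask code and in ShuffleHandler, so treat them as assumptions and check the source. Host, port and IDs are placeholders, and the secret-token authentication mentioned in the reply is left out.

```java
/**
 * Hypothetical helper (NOT a Hadoop class): builds the URL a reducer would
 * request to pull one map's output for its partition.
 */
final class MapOutputUrlSketch {
    private MapOutputUrlSketch() {}

    /**
     * @param host         node that ran the map task (placeholder)
     * @param httpPort     HTTP port of the serving daemon (TaskTracker in
     *                     Hadoop 1, NodeManager shuffle service in YARN)
     * @param jobId        job ID, e.g. "job_201212031808_0001"
     * @param mapAttemptId map task attempt ID
     * @param partition    reducer's partition number
     */
    static String mapOutputUrl(String host, int httpPort, String jobId,
                               String mapAttemptId, int partition) {
        // Assumed layout: /mapOutput?job=...&map=...&reduce=...
        return "http://" + host + ":" + httpPort
                + "/mapOutput?job=" + jobId
                + "&map=" + mapAttemptId
                + "&reduce=" + partition;
    }
}
```

The job ID and map ID in the query are what the serving side would use to locate the file in the well-known directory the reply mentions, and the reduce number selects the partition within it.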