MapReduce, mail # user - Questions with regard to scheduling of map and reduce tasks


Vasco Visser 2012-08-30, 17:41
Vinod Kumar Vavilapalli 2012-08-30, 18:19
Vasco Visser 2012-08-30, 23:38
祝美祺 2012-08-31, 02:07
Vinod Kumar Vavilapalli 2012-08-31, 03:51
Vasco Visser 2012-08-31, 11:17
Re: Questions with regard to scheduling of map and reduce tasks
Vinod Kumar Vavilapalli 2012-08-31, 22:59

You don't need to touch the protocol-buffer record code at all, as there are java-native interfaces for everything, e.g. org.apache.hadoop.yarn.api.AMRMProtocol.
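To make the AMRMProtocol lifecycle concrete, here is a minimal stand-in sketch of the AM <-> RM conversation it carries (register, a heartbeat loop of allocate() calls, then finish). All types below are invented placeholders, not the real org.apache.hadoop.yarn.api classes; only the call sequence mirrors the protocol.

```java
import java.util.ArrayList;
import java.util.List;

public class AmLifecycleSketch {

    // Placeholder for the RM side of AMRMProtocol (invented interface,
    // method names chosen to echo the real protocol's register/allocate/finish).
    interface ResourceManagerStub {
        void registerApplicationMaster(String amHost);
        List<String> allocate(List<String> asks); // returns granted container ids
        void finishApplicationMaster(String finalStatus);
    }

    // Trivial in-memory RM that grants one container per ask.
    static class GrantAllRm implements ResourceManagerStub {
        private int next = 0;
        public void registerApplicationMaster(String amHost) { }
        public List<String> allocate(List<String> asks) {
            List<String> granted = new ArrayList<>();
            for (int i = 0; i < asks.size(); i++) granted.add("container-" + next++);
            return granted;
        }
        public void finishApplicationMaster(String finalStatus) { }
    }

    // The AM lifecycle: register, heartbeat until asks are granted, finish.
    static List<String> runLifecycle(List<String> asks) {
        ResourceManagerStub rm = new GrantAllRm();
        rm.registerApplicationMaster("am-host");
        List<String> containers = rm.allocate(asks); // one heartbeat
        rm.finishApplicationMaster("SUCCEEDED");
        return containers;
    }

    public static void main(String[] args) {
        System.out.println(runLifecycle(List.of("host1", "host2")));
    }
}
```

In the real MR AM this loop lives inside the RMContainerAllocator service rather than being called directly.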

Regarding your question: the JobClient first obtains the locations of the DFS blocks via InputFormat.getSplits() and uploads the accumulated information into a split file; see Job.submitInternal() ->  JobSubmitter.writeSplits() ->  ...
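As an illustration of what getSplits() accumulates, here is a self-contained sketch (not Hadoop's actual classes) of the simplest case, where each split covers one DFS block and inherits the hosts holding that block's replicas, mirroring what InputSplit.getLocations() later reports. All names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // A DFS block replica set: a byte range plus the hosts holding replicas.
    record Block(long offset, long length, List<String> hosts) {}

    // A split: a byte range plus preferred hosts, echoing InputSplit.getLocations().
    record Split(long offset, long length, List<String> hosts) {}

    // Simplest (and common) case: one split per block, so each split's
    // preferred hosts are exactly the block's replica hosts.
    static List<Split> getSplits(List<Block> blocks) {
        List<Split> splits = new ArrayList<>();
        for (Block b : blocks) {
            splits.add(new Split(b.offset(), b.length(), b.hosts()));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Block> blocks = List.of(
            new Block(0, 128, List.of("host1", "host2")),
            new Block(128, 128, List.of("host2", "host3")));
        for (Split s : getSplits(blocks)) {
            System.out.println(s.offset() + "+" + s.length() + " @ " + s.hosts());
        }
    }
}
```

In Hadoop itself this per-split host list is what JobSubmitter.writeSplits() serializes into the split file alongside the split metadata.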

The MR AM then downloads and reads the split file, reconstructs the split information, and creates TaskAttempts (TAs), which then use it to request containers. See the MRAppMaster code (JobImpl.InitTransition) for how TAs are created with host information.
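A rough sketch of the request side of that flow (not the actual MRAppMaster code): the AM turns each task's preferred hosts into resource asks keyed by location, and also asks at the rack and "*" (any-node) levels so the scheduler can relax locality when no data-local slot is free. The rack names and method names below are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RequestSketch {
    // For each task, bump the ask count at node, rack, and "*" levels.
    static Map<String, Integer> buildRequests(List<List<String>> taskHosts,
                                              Map<String, String> hostToRack) {
        Map<String, Integer> asks = new LinkedHashMap<>();
        for (List<String> hosts : taskHosts) {
            for (String h : hosts) {
                asks.merge(h, 1, Integer::sum);                       // node-level ask
                asks.merge(hostToRack.getOrDefault(h, "/default-rack"),
                           1, Integer::sum);                          // rack-level ask
            }
            asks.merge("*", 1, Integer::sum);                         // any-node ask
        }
        return asks;
    }

    public static void main(String[] args) {
        Map<String, String> racks = Map.of("host1", "/rack1", "host2", "/rack1");
        List<List<String>> tasks = List.of(
            List.of("host1"),
            List.of("host1", "host2"));
        System.out.println(buildRequests(tasks, racks));
    }
}
```

The key point is that the asks are aggregated counts per location, not per-task tickets; the RM hands back containers and the AM matches them to waiting TAs.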

HTH,

+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Aug 31, 2012, at 4:17 AM, Vasco Visser wrote:

> Thanks again for the reply, it is becoming clear.
>
> While on the subject of going over the code, do you know by any chance
> where the piece of code is that creates resource requests according to
> locations of HDFS blocks? I am looking for that, but the protocol
> buffer stuff makes it difficult for me to understand what is going on.
>
> regards, Vasco
>
Vasco Visser 2012-09-02, 16:41