1. In Hadoop 1.x, a job is executed by map tasks and reduce tasks
together, following a typical process (map > shuffle > reduce). In Yarn, as
far as I know, an MRv1 job is executed only by the ApplicationMaster.
- Yarn can run multiple kinds of jobs (MR, MPI, ...), but an MRv1 job has a
special execution process (map > shuffle > reduce) in Hadoop 1.x. How does
Yarn execute an MRv1 job? Does it still include the MR-specific steps from
Hadoop 1.x, such as map, sort, merge, combine and shuffle?
- Do the MRv1 parameters still work in Yarn, for example
mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent?
- What is the general process by which a Yarn ApplicationMaster executes a
job?
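
To make clear which flow I mean by map > shuffle > reduce, here is a minimal
single-process sketch in Python. It only illustrates the order of the phases
with a word count; it is not Hadoop code, and every function name in it is
hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record and collect (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_phase(pairs):
    """Group values by key and order the groups by key, as a stand-in for
    the sort/merge/shuffle steps between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups, reducer):
    """Apply the reducer once per key over all values grouped for that key."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    # Emit (word, 1) for each whitespace-separated word in the line.
    return [(word, 1) for word in line.split()]


def sum_reducer(key, values):
    # Sum the counts collected for one word.
    return sum(values)


def run_job(records):
    """Run the three phases in order: map > shuffle > reduce."""
    pairs = map_phase(records, word_mapper)
    groups = shuffle_phase(pairs)
    return reduce_phase(groups, sum_reducer)
```

For example, run_job(["a b a", "b c"]) counts "a" and "b" twice and "c" once.
My question is whether Yarn still runs these phases in this way for an MR job.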
2. In Hadoop 1.x, we can set the map/reduce slots by setting
- For Yarn, the two parameters above no longer work, since Yarn uses
containers instead, right?
- For Yarn, we can set the total physical memory for a NodeManager using
'yarn.nodemanager.resource.memory-mb'. But how do we set the default
physical memory size of a container?
- How do we set the maximum physical memory size of a container? With the
'mapred.child.java.opts' parameter?
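
To show the arithmetic I have in mind, here is a rough Python sketch of how
container sizes might relate to the NodeManager memory. It rests on
assumptions I have not confirmed: that the scheduler rounds each memory
request up to a multiple of 'yarn.scheduler.minimum-allocation-mb' and
rejects requests above 'yarn.scheduler.maximum-allocation-mb'. Some
schedulers may round differently. The function names are hypothetical and
the numbers are only examples.

```python
def normalize_request(requested_mb, min_alloc_mb, max_alloc_mb):
    """Simplified model (an assumption, not the real scheduler code):
    round a container memory request up to a multiple of the minimum
    allocation, and refuse anything above the maximum allocation."""
    if requested_mb > max_alloc_mb:
        raise ValueError("request exceeds the maximum allocation")
    # Ceiling division, with at least one minimum-sized unit.
    units = max(1, -(-requested_mb // min_alloc_mb))
    return min(units * min_alloc_mb, max_alloc_mb)


def containers_per_node(node_memory_mb, container_mb):
    """How many containers of one size fit into the memory that a
    NodeManager offers (yarn.nodemanager.resource.memory-mb)."""
    return node_memory_mb // container_mb


# Example values only: 8192 MB per NodeManager, 1024 MB minimum allocation,
# 8192 MB maximum allocation, and a task asking for 1500 MB.
granted_mb = normalize_request(1500, 1024, 8192)
slots_equivalent = containers_per_node(8192, granted_mb)
```

Under these assumptions the 1500 MB request becomes a 2048 MB container, and
four such containers fit on the node, which would play the role that the
fixed map/reduce slots played in Hadoop 1.x.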