HDFS >> mail # user >> RE: How Yarn execute MRv1 job?


Devaraj k 2013-06-19, 05:35
Rahul Bhattacharjee 2013-06-19, 05:41
Arun C Murthy 2013-06-19, 05:54
Rahul Bhattacharjee 2013-06-19, 10:20
sam liu 2013-06-20, 01:45
Arun C Murthy 2013-06-20, 04:12
sam liu 2013-06-20, 06:11
Azuryy Yu 2013-06-20, 06:33
sam liu 2013-06-20, 06:56
Arun C Murthy 2013-06-20, 06:59
Azuryy Yu 2013-06-20, 07:17
sam liu 2013-06-20, 08:48
Re: How Yarn execute MRv1 job?
By "please correct", I meant: please correct me if my statement is wrong.
On Wed, Jun 19, 2013 at 11:11 AM, Rahul Bhattacharjee <
[EMAIL PROTECTED]> wrote:

> Hi Devaraj,
>
> As for the resource request for a Yarn container, currently only
> memory is considered as a resource, not CPU. Please correct.
>
> Thanks,
> Rahul
>
>
> On Wed, Jun 19, 2013 at 11:05 AM, Devaraj k <[EMAIL PROTECTED]> wrote:
>
>> Hi Sam,
>>
>> Please find the answers to your queries below.
>>
>>
>> >- Yarn could run multiple kinds of jobs (MR, MPI, ...), but an MRv1 job has
>> a special execution process (map > shuffle > reduce) in Hadoop 1.x. How does
>> Yarn execute an MRv1 job? Does it still include the special MR steps of
>> Hadoop 1.x, like map, sort, merge, combine and shuffle?
>>
>> Yarn works in terms of applications. An MR job is one kind of
>> application, which uses MRAppMaster (i.e. the ApplicationMaster for that
>> application). To run different kinds of applications, we need an
>> ApplicationMaster for each kind of application.
>>
>> >- Do the MRv1 parameters still work for Yarn? Like
>> mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent?
>>
>> These configurations still work for an MR job in Yarn.
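
As a rough illustration of what these two parameters control, the sketch below estimates the point at which a map task starts spilling its sort buffer to disk. It is a simplified model, not Hadoop's actual buffer accounting (which also tracks record metadata). The function name `spill_threshold_bytes` and the dict-based configuration are assumptions made for this example; the fallback values 100 and 0.80 are the documented defaults of the two parameters.

```python
# Simplified model: the spill starts once the sort buffer is filled
# to the configured fraction. Real Hadoop accounting is more detailed.
MB = 1024 * 1024


def spill_threshold_bytes(conf):
    """Approximate number of buffered bytes at which a spill begins."""
    sort_mb = int(conf.get("mapreduce.task.io.sort.mb", 100))
    spill_percent = float(conf.get("mapreduce.map.sort.spill.percent", 0.80))
    return int(sort_mb * MB * spill_percent)


# Hypothetical job settings, chosen only for illustration.
job_conf = {
    "mapreduce.task.io.sort.mb": "200",
    "mapreduce.map.sort.spill.percent": "0.90",
}

if __name__ == "__main__":
    print(spill_threshold_bytes(job_conf))
```

A larger buffer or a higher spill percentage means fewer, larger spills, at the cost of more task memory.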
>>
>>
>> >- What's the general process for the ApplicationMaster of Yarn to execute a
>> job?
>>
>> MRAppMaster (the ApplicationMaster for an MR job) manages the job life cycle,
>> which includes getting the containers for maps & reducers, launching the
>> containers using the NM, tracking the task status until completion, and
>> managing the failed tasks.
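
The life cycle described above (get containers, launch them, track status, handle failures) can be sketched as a toy loop. This is not MRAppMaster code: `run_job`, its parameters, and the random failure model are all hypothetical, and container requests are assumed to be granted immediately. The limit of 4 attempts mirrors the default of `mapreduce.map.maxattempts`.

```python
import random


def run_job(num_tasks, max_attempts=4, fail_rate=0.3, seed=0):
    """Toy ApplicationMaster loop: request, launch, track, retry."""
    rng = random.Random(seed)
    attempts = {task: 0 for task in range(num_tasks)}
    pending = list(attempts)
    completed = []
    while pending:
        task = pending.pop(0)
        # 1. Ask the ResourceManager for a container (always granted here).
        # 2. Launch the task in that container through a NodeManager.
        attempts[task] += 1
        # 3. Track the task status (simulated as a random outcome).
        succeeded = rng.random() >= fail_rate
        if succeeded:
            completed.append(task)
        elif attempts[task] < max_attempts:
            # 4. Manage the failed task by scheduling another attempt.
            pending.append(task)
        else:
            raise RuntimeError(f"task {task} failed {max_attempts} times")
    return completed, attempts


if __name__ == "__main__":
    done, tries = run_job(num_tasks=5)
    print(done, tries)
```

The point of the sketch is only the division of labour: the ApplicationMaster drives the job, while the RM hands out containers and the NMs run them.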
>>
>>
>> >2. In Hadoop 1.x, we can set the map/reduce slots by setting
>> 'mapred.tasktracker.map.tasks.maximum' and
>> 'mapred.tasktracker.reduce.tasks.maximum'
>> >- For Yarn, the above two parameters do not work any more, as Yarn uses
>> containers instead, right?
>>
>> Correct, these params don’t work in Yarn. In Yarn, allocation is based
>> entirely on resources (memory, CPU). The ApplicationMaster requests resources
>> from the RM to complete the tasks for that application.
>>
>>
>> >- For Yarn, we can set the whole physical mem for a NodeManager using
>> 'yarn.nodemanager.resource.memory-mb'. But how to set the default size of
>> physical mem of a container?
>>
>> The ApplicationMaster is responsible for getting the containers from the RM
>> by sending resource requests. For an MR job, you can use the
>> "mapreduce.map.memory.mb" and "mapreduce.reduce.memory.mb" configurations
>> to specify the map & reduce container memory sizes.
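
To see how these sizes replace the fixed slot counts of Hadoop 1.x, the sketch below estimates how many map or reduce containers fit on one NodeManager by memory alone. The numbers are made up for illustration, and the helper `containers_per_node` is hypothetical. The estimate ignores the ApplicationMaster's own container, any rounding the scheduler applies to requests, and CPU.

```python
def containers_per_node(node_memory_mb, container_memory_mb):
    """Upper bound on concurrent containers on one node, by memory only."""
    if container_memory_mb <= 0:
        raise ValueError("container memory must be positive")
    return node_memory_mb // container_memory_mb


# Illustrative values only; they are not recommendations.
conf = {
    "yarn.nodemanager.resource.memory-mb": 8192,
    "mapreduce.map.memory.mb": 1024,
    "mapreduce.reduce.memory.mb": 2048,
}

if __name__ == "__main__":
    node_mb = conf["yarn.nodemanager.resource.memory-mb"]
    print("maps per node:", containers_per_node(node_mb, conf["mapreduce.map.memory.mb"]))
    print("reduces per node:", containers_per_node(node_mb, conf["mapreduce.reduce.memory.mb"]))
```

Unlike fixed slots, the same node memory can be shared by any mix of map and reduce containers.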
>>
>> >- How to set the maximum size of physical mem of a container? By the
>> parameter of 'mapred.child.java.opts'?
>>
>> It can be set based on the resources requested for that container.
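
The reply does not say how to derive the JVM options from the container size. One common approach, shown here purely as an assumption, is to keep the heap somewhat below the container's memory, since the container limit covers the whole process and not just the Java heap. The helper `child_java_opts` and the 0.8 ratio are hypothetical choices for this sketch, not values taken from the thread.

```python
def child_java_opts(container_memory_mb, heap_ratio=0.8):
    """Build a -Xmx option that leaves headroom inside the container.

    heap_ratio is an assumed rule of thumb, not a value from the thread.
    """
    if not 0 < heap_ratio <= 1:
        raise ValueError("heap_ratio must be in (0, 1]")
    heap_mb = int(container_memory_mb * heap_ratio)
    return f"-Xmx{heap_mb}m"


if __name__ == "__main__":
    # e.g. a map container requested via mapreduce.map.memory.mb = 1024
    print(child_java_opts(1024))
```

If the heap were set larger than the container request, the task could exceed the memory it asked the RM for.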
>>
>> Thanks,
>>
>> Devaraj K
>>
>> *From:* sam liu [mailto:[EMAIL PROTECTED]]
>> *Sent:* 19 June 2013 08:16
>> *To:* [EMAIL PROTECTED]
>> *Subject:* How Yarn execute MRv1 job?
>>
>> Hi,
>>
>> 1. In Hadoop 1.x, a job is executed by map tasks and reduce tasks
>> together, with a typical process (map > shuffle > reduce). In Yarn, as far
>> as I know, an MRv1 job will be executed only by the ApplicationMaster.
>> - Yarn could run multiple kinds of jobs (MR, MPI, ...), but an MRv1 job has
>> a special execution process (map > shuffle > reduce) in Hadoop 1.x. How does
>> Yarn execute an MRv1 job? Does it still include the special MR steps of
>> Hadoop 1.x, like map, sort, merge, combine and shuffle?
>> - Do the MRv1 parameters still work for Yarn? Like
>> mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent?
>> - What's the general process for the ApplicationMaster of Yarn to execute a
>> job?
>>
>> 2. In Hadoop 1.x, we can set the map/reduce slots by setting
>> 'mapred.tasktracker.map.tasks.maximum' and
>> 'mapred.tasktracker.reduce.tasks.maximum'
>> - For Yarn, the above two parameters do not work any more, as Yarn uses
>> containers instead, right?
>> - For Yarn, we can set the whole physical mem for a NodeManager using