MapReduce >> mail # user >> Implementing and running an applicationmaster


Re: Implementing and running an applicationmaster
Hi

There is a way, but it's not an easy one. You would have to override the
container request code in MR_AM. As each container in MapReduce gets the same
amount of memory, OOM shouldn't be a problem, since the inner task "buffers"
can be spilled to disk. I am no MapReduce (code) specialist, but I would start
by finding MR_Driver.class and MR_AM.class. Then override Driver.class so that
it executes your own class, Custom_MR_AM (C_MR_AM). C_MR_AM would be a copy of
MR_AM with the container request code changed, so that you can allocate
N containers with X memory and M containers with Y memory.
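
To make the "N containers with X memory and M containers with Y memory" idea
concrete, here is a minimal sketch in plain Java. It only models the plan that
a custom AM would build; ContainerPlan, Ask, plan and totalMemoryMb are
hypothetical names, not Hadoop classes. Giving each memory size its own
priority is an assumption on my part, not something taken from the MR_AM code.
In a real AM each entry would presumably be turned into a container request and
handed to the YARN client library; that step is left out so the sketch runs
without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: it only models the container plan.
final class ContainerPlan {

    // One planned container ask: memory in MB plus a request priority.
    static final class Ask {
        final int memoryMb;
        final int priority;

        Ask(int memoryMb, int priority) {
            this.memoryMb = memoryMb;
            this.priority = priority;
        }
    }

    // Builds n asks of xMb followed by m asks of yMb.
    // Assumption: each memory size gets its own priority (0 and 1).
    static List<Ask> plan(int n, int xMb, int m, int yMb) {
        if (n < 0 || m < 0 || xMb <= 0 || yMb <= 0) {
            throw new IllegalArgumentException("counts must be >= 0, memory must be > 0");
        }
        List<Ask> asks = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            asks.add(new Ask(xMb, 0));
        }
        for (int i = 0; i < m; i++) {
            asks.add(new Ask(yMb, 1));
        }
        return asks;
    }

    // Sums the memory of all asks, e.g. to check the plan against cluster limits.
    static int totalMemoryMb(List<Ask> asks) {
        int total = 0;
        for (Ask ask : asks) {
            total += ask.memoryMb;
        }
        return total;
    }
}
```

For the case from the question (two containers with 100MB and two with 200MB),
the plan would be built with ContainerPlan.plan(2, 100, 2, 200).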

The hadoop-mapreduce-examples.jar is just a bunch of HelloWorld jobs, there so
that a new user can pick up and "learn" MR quickly.

Maybe some real MR specialist can give you better advice than me.

regards
tmp
2013/12/5 Yue Wang <[EMAIL PROTECTED]>

> Hi,
>
> Thank you for your answer. Now I understand the connection between the two
> ways.
>
> I asked this question because I want to benefit from the YARN
> architecture.
> If I understood correctly, I can let my ApplicationMaster request
> containers more flexibly. For example, I can request two containers with
> 100MB memory and two containers with 200MB memory for my mappers on YARN.
> However, I cannot do that on MRv1.
>
> So if I execute a WordCount program by typing "yarn jar
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount
> wordcount/ wc-output/", such flexibility is gone.
>
> Is there a way to let my ApplicationMaster execute WordCount on HDFS on
> containers?
>
>
> Thanks!
>
>
> On Thu, Dec 5, 2013 at 4:28 AM, Rob Blah <[EMAIL PROTECTED]> wrote:
>
>> Hi
>>
>> If I understood you correctly, you would like to run your AM with the YARN
>> Client from the shell, as opposed to running the Driver as in MRv1. But it's
>> the same thing (more or less). In the example you provided
>> (org.apache.hadoop.yarn.applications.DistributedShell) the Client.class is
>> the "driver". However, since distributed-shell is a "simple" application, you
>> do not need a lot of configuration (setting fields in Configuration.class,
>> I/O formats etc.). The same goes for any other application. As for the
>> second example (org.apache.hadoop.examples.WordCount), the MapReduce AM
>> requires certain configuration, thus you have to do it the "old way". The
>> main difference would be: MR -> end-user-config -> driver, DS -> driver (but
>> you can still create your own end-user-config). Hope this answers your
>> question and that I understood it correctly.
>>
>> regards
>> tmp
>>
>>
>> 2013/12/5 Yue Wang <[EMAIL PROTECTED]>
>>
>>> Hi,
>>>
>>> I took a look at the code and found some examples on the web.
>>> One example is: http://wiki.opf-labs.org/display/SP/Resource+management
>>>
>>> It seems that users can run simple shell commands using Client of YARN.
>>> But when it comes to a practical MapReduce example like WordCount,
>>> people still run commands in the old way as in MRv1.
>>>
>>> How can I run WordCount using Client and ApplicationMaster of YARN so
>>> that I can request resources flexibly?
>>>
>>>
>>> Thanks!
>>>
>>>
>>> On Mon, Dec 2, 2013 at 11:26 AM, Rob Blah <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi
>>>>
>>>> Follow the example provided in
>>>> Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
>>>>
>>>> regards
>>>> tmp
>>>>
>>>>
>>>> 2013/12/1 Yue Wang <[EMAIL PROTECTED]>
>>>>
>>>>> Hi,
>>>>>
>>>>> I found the page (
>>>>> http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
>>>>> and know how to write an ApplicationMaster.
>>>>>
>>>>> However, is there a complete example showing how to run this
>>>>> ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>> Yue
>>>>>
>>>>
>>>>
>>>
>>
>