Re: YARN Pi example job stuck at 0% (No MR tasks are started by ResourceManager)
Hi Harsh,

Thanks a lot for your response. I am going to try your suggestions and let
you know the outcome.
I am running the cluster on a VMware hypervisor. I have 3 physical machines,
each with 16 GB of RAM and 4 TB of disk (2 HDs of 2 TB each). On every machine
I am running 4 VMs, and each VM has 3.2 GB of memory. I built this cluster to
try out HA (NN, ZK, HMaster), since we are a little reluctant to deploy
anything without HA in prod.
This cluster is supposed to be used as an HBase cluster, and MR is going to be
used only for bulk loading. Also, my data dump is around 10 GB (which is
pretty small for Hadoop). I am going to load this data into 4 different
schemas, which will be roughly 150 million records in HBase.
So, I think I will lower the memory requirements of YARN for my use case
rather than reduce the number of DataNodes to increase the memory of the
remaining ones. Do you think this is the right approach for my cluster
environment?
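Concretely, I am thinking of something along these lines in my client's
mapred-site.xml (the property names are the ones you mentioned; the values
below are only a rough first guess and still need tuning for this cluster):

  <!-- Illustrative values only: shrink the AM and task demands so they
       fit inside the 1200 MB NodeManager offering -->
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1000</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>512</value>
  </property>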
Also, on a side note, shouldn't the NodeManager throw an error on this kind
of memory problem? Should I file a JIRA for this? It just sat there quietly.

Thanks a lot,
Anil Gupta

On Fri, Jul 27, 2012 at 3:36 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi,
>
> The 'root' user doesn't matter. You may run jobs as any username on an
> unsecured cluster; it should work just the same.
>
> The config yarn.nodemanager.resource.memory-mb = 1200 is your issue.
> By default, the tasks will execute with a resource demand of 1 GB, and
> the AM itself demands, by default, 1.5 GB to run. None of your nodes
> can therefore start your AM (demand = 1500 MB), and if the AM
> doesn't start, your job won't initiate either.
>
> You can do a few things:
>
> 1. Raise yarn.nodemanager.resource.memory-mb to a value close to 4 GB,
> perhaps, if you have the RAM. Think of it as the new 'slots' divider:
> the larger the offering (close to the total RAM you can offer for
> containers from the machine), the more tasks may run on it
> (depending on their own demands, of course). Restart the NMs one by
> one and this app will begin to execute (see the sketch after this list).
> 2. Lower the AM's requirement, i.e. lower
> yarn.app.mapreduce.am.resource.mb in your client's mapred-site.xml or
> job config from 1500 to 1000 or less, so it fits into the NM's offering.
> Likewise, control the map and reduce requests via
> mapreduce.map.memory.mb and mapreduce.reduce.memory.mb as needed.
> Resubmit the job with these lowered requirements and things should now
> work.
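>
> For example, option 1 might look roughly like this in each NM's
> yarn-site.xml (the 4096 value is only illustrative and assumes the node
> can spare that much RAM for containers):
>
>   <!-- Illustrative value only: total memory the NM offers for containers -->
>   <property>
>     <name>yarn.nodemanager.resource.memory-mb</name>
>     <value>4096</value>
>   </property>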
>
> Optionally, you may also cap the min/max possible requests via
> "yarn.scheduler.minimum-allocation-mb" and
> "yarn.scheduler.maximum-allocation-mb", such that no app/job ends up
> demanding more than a certain limit and running into the
> 'forever-waiting' state, as in your case.
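>
> As an illustration (the values here are only an example, not a
> recommendation for your cluster), the caps go into yarn-site.xml:
>
>   <!-- Example caps only: bound the per-container memory requests -->
>   <property>
>     <name>yarn.scheduler.minimum-allocation-mb</name>
>     <value>256</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.maximum-allocation-mb</name>
>     <value>1024</value>
>   </property>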
>
> Hope this helps! For some communication diagrams on how an app (such
> as MR2, etc.) may work on YARN and how the resource negotiation works,
> you can check out this post from Ahmed at
> http://www.cloudera.com/blog/2012/02/mapreduce-2-0-in-hadoop-0-23/
>
> On Sat, Jul 28, 2012 at 3:35 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> > Hi Harsh,
> >
> > I have set *yarn.nodemanager.resource.memory-mb* to 1200 MB. Also, does
> > it matter if I run the jobs as "root" while the RM and NM services are
> > running as the "yarn" user? However, I have created the /user/root
> > directory for the root user in HDFS.
> >
> > Here is the yarn-site.xml:
> > <configuration>
> >   <property>
> >     <name>yarn.nodemanager.aux-services</name>
> >     <value>mapreduce.shuffle</value>
> >   </property>
> >
> >   <property>
> >     <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
> >     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> >   </property>
> >
> >   <property>
> >     <name>yarn.log-aggregation-enable</name>
> >     <value>true</value>
> >   </property>
> >
> >   <property>
> >     <description>List of directories to store localized files
> > in.</description>
> >     <name>yarn.nodemanager.local-dirs</name>

Thanks & Regards,
Anil Gupta