Re: Why my tests shows Yarn is worse than MRv1 for terasort?
Hi Harsh,

Following the above suggestions, I removed the duplicated setting and
reduced the values of 'yarn.nodemanager.resource.cpu-cores',
'yarn.nodemanager.vcores-pcores-ratio' and
'yarn.nodemanager.resource.memory-mb' to 16, 8 and 12000 respectively. After
that, the efficiency improved by about 18%.
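
For reference, this is roughly what I now have in yarn-site.xml for those
three properties (same names and values as above, shown in the usual
property syntax):

  <property>
    <name>yarn.nodemanager.resource.cpu-cores</name>
    <value>16</value>
  </property>

  <property>
    <name>yarn.nodemanager.vcores-pcores-ratio</name>
    <value>8</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>12000</value>
  </property>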

I have a few questions:

- How can I figure out the number of containers? Why do you say it will be
22 containers, given a 22 GB memory resource? (My rough math is below this
list.)
- My machine has 32 GB of memory; how much of it is reasonable to assign to
containers?
- In mapred-site.xml, if I set 'mapreduce.framework.name' to 'yarn', will
the other mapred-site.xml parameters, such as 'mapreduce.task.io.sort.mb'
and 'mapreduce.map.sort.spill.percent', still take effect under the YARN
framework?
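
My rough math on the container count (please correct me if the scheduler
does not count it this way): containers per NM ~ yarn.nodemanager.resource.memory-mb
divided by the per-container memory, which defaults to the 1024 MB of
mapreduce.map/reduce.memory.mb that Sandy mentioned, so:

  22528 MB / 1024 MB ~ 22 containers     (the old 22 GB setting)
  12000 MB / 1024 MB ~ 11-12 containers  (the new 12000 MB setting)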

Thanks!

2013/6/8 Harsh J <[EMAIL PROTECTED]>

> Hey Sam,
>
> Did you get a chance to retry with Sandy's suggestions? The config
> appears to be asking NMs to use roughly 22 total containers (as
> opposed to 12 total tasks in MR1 config) due to a 22 GB memory
> resource. This could matter a lot, given the CPU is still the same for
> both test runs.
>
> On Fri, Jun 7, 2013 at 12:23 PM, Sandy Ryza <[EMAIL PROTECTED]>
> wrote:
> > Hey Sam,
> >
> > Thanks for sharing your results.  I'm definitely curious about what's
> > causing the difference.
> >
> > A couple observations:
> > It looks like you've got yarn.nodemanager.resource.memory-mb in there
> > twice with two different values.
> >
> > Your max JVM memory of 1000 MB is (dangerously?) close to the default
> > mapreduce.map/reduce.memory.mb of 1024 MB. Are any of your tasks getting
> > killed for running over resource limits?
> >
> > -Sandy
> >
> >
> > On Thu, Jun 6, 2013 at 10:21 PM, sam liu <[EMAIL PROTECTED]> wrote:
> >>
> >> The terasort execution log shows that reduce spent about 5.5 mins from
> >> 33% to 35%, as below.
> >> 13/06/10 08:02:22 INFO mapreduce.Job:  map 100% reduce 31%
> >> 13/06/10 08:02:25 INFO mapreduce.Job:  map 100% reduce 32%
> >> 13/06/10 08:02:46 INFO mapreduce.Job:  map 100% reduce 33%
> >> 13/06/10 08:08:16 INFO mapreduce.Job:  map 100% reduce 35%
> >> 13/06/10 08:08:19 INFO mapreduce.Job:  map 100% reduce 40%
> >> 13/06/10 08:08:22 INFO mapreduce.Job:  map 100% reduce 43%
> >>
> >> Anyway, below are my configurations for your reference. Thanks!
> >> (A) core-site.xml
> >> only define 'fs.default.name' and 'hadoop.tmp.dir'
> >>
> >> (B) hdfs-site.xml
> >>   <property>
> >>     <name>dfs.replication</name>
> >>     <value>1</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.name.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_name_dir</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.data.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_data_dir</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.block.size</name>
> >>     <value>134217728</value><!-- 128MB -->
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.namenode.handler.count</name>
> >>     <value>64</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.datanode.handler.count</name>
> >>     <value>10</value>
> >>   </property>
> >>
> >> (C) mapred-site.xml
> >>   <property>
> >>     <name>mapreduce.cluster.temp.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_temp</value>
> >>     <description>No description</description>
> >>     <final>true</final>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapreduce.cluster.local.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_local_dir</value>
> >>     <description>No description</description>
> >>     <final>true</final>
> >>   </property>
> >>
> >> <property>
> >>   <name>mapreduce.child.java.opts</name>
> >>   <value>-Xmx1000m</value>
> >> </property>
> >>
> >> <property>
> >>     <name>mapreduce.framework.name</name>
> >>     <value>yarn</value>
> >>    </property>
> >>
> >>  <property>
> >>     <name>mapreduce.tasktracker.map.tasks.maximum</name>
> >>     <value>8</value>