Re: Why do my tests show Yarn is worse than MRv1 for terasort?
Hi Harsh,

Following the above suggestions, I removed the duplicated setting and reduced
'yarn.nodemanager.resource.cpu-cores', 'yarn.nodemanager.vcores-pcores-ratio' and
'yarn.nodemanager.resource.memory-mb' to 16, 8 and 12000 respectively. After that,
performance improved by about 18%. I have a few questions (the adjusted properties
are sketched below, after the questions):

- How can I work out the container count? Why do you say it will be 22
containers due to a 22 GB memory resource?
- My machine has 32 GB of memory; how much of it should be assigned to
containers?
- If I set 'mapreduce.framework.name' to 'yarn' in mapred-site.xml, will the
other mapred-site.xml parameters, such as 'mapreduce.task.io.sort.mb' and
'mapreduce.map.sort.spill.percent', still take effect under YARN?
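
For reference, a minimal sketch of the three adjusted properties (assuming they
live in yarn-site.xml, as in stock Hadoop 2.0.4-alpha; the values are the ones
quoted above):

  <property>
    <name>yarn.nodemanager.resource.cpu-cores</name>
    <value>16</value>
  </property>

  <property>
    <name>yarn.nodemanager.vcores-pcores-ratio</name>
    <value>8</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>12000</value>
    <!-- with the default 1024 MB per-task container, this allows roughly
         floor(12000 / 1024) = 11 containers per node -->
  </property>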

Thanks!

2013/6/8 Harsh J <[EMAIL PROTECTED]>

> Hey Sam,
>
> Did you get a chance to retry with Sandy's suggestions? The config
> appears to be asking the NMs to run roughly 22 total containers (as
> opposed to 12 total task slots in the MR1 config) due to a 22 GB memory
> resource. This could have a significant impact, given that the CPU is
> still the same for both test runs.
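
A rough reconstruction of how that 22-container figure falls out, assuming the
node manager was advertising about 22 GB of memory (the 22528 MB below is
illustrative, not necessarily the exact configured value) and each task
requested the default 1024 MB container:

  <!-- containers per node ~ floor(22528 / 1024) = 22,
       versus the 12 total task slots of the MR1 config -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>22528</value> <!-- ~22 GB; illustrative value -->
  </property>

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value> <!-- default; mapreduce.reduce.memory.mb has the same default -->
  </property>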
>
> On Fri, Jun 7, 2013 at 12:23 PM, Sandy Ryza <[EMAIL PROTECTED]>
> wrote:
> > Hey Sam,
> >
> > Thanks for sharing your results.  I'm definitely curious about what's
> > causing the difference.
> >
> > A couple of observations:
> > It looks like you've got yarn.nodemanager.resource.memory-mb in there
> > twice with two different values.
> >
> > Your max JVM memory of 1000 MB is (dangerously?) close to the default
> > mapreduce.map/reduce.memory.mb of 1024 MB. Are any of your tasks getting
> > killed for running over resource limits?
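
(A hedged aside on that point: if the 1000 MB heap plus JVM and native overhead
pushes a task past its 1024 MB container, the NodeManager can kill the container
for exceeding its physical memory limit. One common adjustment, with purely
illustrative values, is to keep -Xmx at roughly 75-80% of the container size by
requesting a larger container:)

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value> <!-- illustrative: container comfortably larger than the 1000 MB heap -->
  </property>

  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>1536</value> <!-- illustrative -->
  </property>

  <property>
    <name>mapreduce.child.java.opts</name>
    <value>-Xmx1000m</value> <!-- heap stays well under the container limit -->
  </property>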
> >
> > -Sandy
> >
> >
> > On Thu, Jun 6, 2013 at 10:21 PM, sam liu <[EMAIL PROTECTED]> wrote:
> >>
> >> The terasort execution log shows that the reduce spent about 5.5 minutes
> >> going from 33% to 35%, as below.
> >> 13/06/10 08:02:22 INFO mapreduce.Job:  map 100% reduce 31%
> >> 13/06/10 08:02:25 INFO mapreduce.Job:  map 100% reduce 32%
> >> 13/06/10 08:02:46 INFO mapreduce.Job:  map 100% reduce 33%
> >> 13/06/10 08:08:16 INFO mapreduce.Job:  map 100% reduce 35%
> >> 13/06/10 08:08:19 INFO mapreduce.Job:  map 100% reduce 40%
> >> 13/06/10 08:08:22 INFO mapreduce.Job:  map 100% reduce 43%
> >>
> >> Anyway, below are my configurations for your reference. Thanks!
> >> (A) core-site.xml
> >> only define 'fs.default.name' and 'hadoop.tmp.dir'
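
A placeholder sketch of that minimal core-site.xml (the URI and path below are
hypothetical stand-ins, not the values actually used):

  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value> <!-- hypothetical NameNode URI -->
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/tmp_dir</value> <!-- hypothetical; mirrors the layout of the other dirs -->
  </property>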
> >>
> >> (B) hdfs-site.xml
> >>   <property>
> >>     <name>dfs.replication</name>
> >>     <value>1</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.name.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_name_dir</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.data.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_data_dir</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.block.size</name>
> >>     <value>134217728</value><!-- 128MB -->
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.namenode.handler.count</name>
> >>     <value>64</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.datanode.handler.count</name>
> >>     <value>10</value>
> >>   </property>
> >>
> >> (C) mapred-site.xml
> >>   <property>
> >>     <name>mapreduce.cluster.temp.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_temp</value>
> >>     <description>No description</description>
> >>     <final>true</final>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapreduce.cluster.local.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_local_dir</value>
> >>     <description>No description</description>
> >>     <final>true</final>
> >>   </property>
> >>
> >> <property>
> >>   <name>mapreduce.child.java.opts</name>
> >>   <value>-Xmx1000m</value>
> >> </property>
> >>
> >> <property>
> >>     <name>mapreduce.framework.name</name>
> >>     <value>yarn</value>
> >>    </property>
> >>
> >>  <property>
> >>     <name>mapreduce.tasktracker.map.tasks.maximum</name>
> >>     <value>8</value>