MapReduce, mail # user - Re: Why my tests shows Yarn is worse than MRv1 for terasort?


Re: Why my tests shows Yarn is worse than MRv1 for terasort?
Sandy Ryza 2013-10-23, 16:40
Based on SLOTS_MILLIS_MAPS, it looks like your map tasks are taking about
three times as long in MR2 as they are in MR1.  This is probably because
you allow twice as many map tasks to run at a time in MR2 (12288/768 = 16
concurrent containers per node, versus 8 map slots in MR1).  Being able to
use all the containers isn't necessarily a good thing if you are
oversubscribing your node's resources.  Because MR1 and MR2 view resources
differently, I think it's better to test with
mapred.reduce.slowstart.completed.maps=.99 so that the map and reduce
phases will run separately.
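(A minimal sketch of one way to set this, assuming it goes in
mapred-site.xml or an equivalent per-job config; the property name and
value are the ones suggested above:)

  <!-- hold reduces back until 99% of maps have finished, so the map
       and reduce phases are measured separately -->
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.99</value>
  </property>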

On the other hand, it looks like your MR1 job has more spilled records than
MR2.  Setting io.sort.record.percent in MR1 to .13 should improve MR1
performance and provide a fairer comparison (MR2 does this tuning for you
automatically).
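(Again only a sketch, assuming the same mapred-site.xml placement:)

  <!-- MR1 only: reserve more of io.sort.mb for per-record metadata,
       which reduces spills; MR2 tunes this automatically -->
  <property>
    <name>io.sort.record.percent</name>
    <value>0.13</value>
  </property>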

-Sandy
On Wed, Oct 23, 2013 at 9:22 AM, Jian Fang <[EMAIL PROTECTED]>wrote:

> The number of map slots and reduce slots on each data node for MR1 is 8
> and 3, respectively. Since MR2 can use all containers for either map or
> reduce, I would expect MR2 to be faster.
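>
> (For reference, a sketch of how those MR1 slot counts would be set; the
> tasktracker property names are the classic MR1 ones, assumed here rather
> than quoted in this thread:)
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>8</value>
>   </property>
>   <property>
>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
>     <value>3</value>
>   </property>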
>
>
> On Wed, Oct 23, 2013 at 8:17 AM, Sandy Ryza <[EMAIL PROTECTED]>wrote:
>
>> How many map and reduce slots are you using per tasktracker in MR1?  How
>> do the average map times compare? (MR2 reports this directly on the web UI,
>> but you can also get a sense in MR1 by scrolling through the map tasks
>> page).  Can you share the counters for MR1?
>>
>> -Sandy
>>
>>
>> On Wed, Oct 23, 2013 at 12:23 AM, Jian Fang <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Unfortunately, turning off JVM reuse still got the same result, i.e.,
>>> about 90 minutes for MR2. I don't think the killed reduces could account
>>> for a 2x slowdown. Something must be very wrong in either the
>>> configuration or the code. Any hints?
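>>>
>>> (For reference, JVM reuse is a classic MR1 setting; a sketch of
>>> disabling it, with the property name assumed from the MR1 defaults
>>> rather than quoted in this thread:)
>>>
>>>   <!-- one task per JVM, i.e. no JVM reuse -->
>>>   <property>
>>>     <name>mapred.job.reuse.jvm.num.tasks</name>
>>>     <value>1</value>
>>>   </property>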
>>>
>>>
>>>
>>> On Tue, Oct 22, 2013 at 5:50 PM, Jian Fang <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Thanks Sandy. I will try to turn JVM reuse off and see what happens.
>>>>
>>>> Yes, I saw quite a few exceptions in the task attempts. For instance:
>>>>
>>>>
>>>> 2013-10-20 03:13:58,751 ERROR [main]
>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>> as:hadoop (auth:SIMPLE) cause:java.nio.channels.ClosedChannelException
>>>> 2013-10-20 03:13:58,752 ERROR [Thread-6]
>>>> org.apache.hadoop.hdfs.DFSClient: Failed to close file
>>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200
>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>>>> No lease on
>>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200:
>>>> File does not exist. Holder
>>>> DFSClient_attempt_1382237301855_0001_m_000200_1_872378586_1 does not have
>>>> any open files.
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2737)
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2801)
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2783)
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:611)
>>>>         at
>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:429)
>>>>         at
>>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48077)
>>>>         at
>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:582)
>>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>>>> --
>>>>         at com.sun.proxy.$Proxy10.complete(Unknown Source)