MapReduce, mail # user - Yarn HDFS and Yarn Exceptions when processing "larger" datasets.


RE: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.
John Lilley 2013-07-02, 18:35
Blah blah,
Can you build and run the DistributedShell example? If it does not run correctly, that would tend to implicate your configuration. If it runs correctly, then your code is suspect.
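For reference, a typical DistributedShell invocation looks roughly like the sketch below. The jar path and version suffix are assumptions based on the 3.0.0-SNAPSHOT build mentioned later in this thread; adjust them to your install layout.

```shell
# Assumed paths for a 3.0.0-SNAPSHOT pseudo-distributed install.
DSHELL_JAR=$HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.0.0-SNAPSHOT.jar

# Run a trivial command (date) in 4 containers, matching the 4-container debug setup.
hadoop jar "$DSHELL_JAR" \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar "$DSHELL_JAR" \
  -shell_command date \
  -num_containers 4
```

If this succeeds on the same cluster and input-independent settings, the RM/AM wiring is probably fine and the problem is in the custom AM.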
John
From: blah blah [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 25, 2013 6:09 PM
To: [EMAIL PROTECTED]
Subject: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Hi All
First, let me apologize for the poor thread title, but I have no idea how to express the problem in one sentence.
I have implemented a new Application Master on Yarn. I am using an old Yarn development version: revision 1437315, from 2013-01-23 (3.0.0-SNAPSHOT). I cannot update to the current trunk version, as the prototype deadline is soon and I don't have time to incorporate the Yarn API changes.
Currently I run experiments in pseudo-distributed mode, and I use guava version 14.0-rc1. I have a problem with Yarn and HDFS exceptions on "larger" datasets. My AM works fine and I can execute it without problems on a debug dataset (1 MB). But when I increase the input size to 6.8 MB, I get the following exceptions:
AM_Exceptions_Stack

Exception in thread "Thread-3" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
    at org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
    at org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
    at org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
    at java.lang.Thread.run(Thread.java:662)
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Response is null.; Host Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
    at $Proxy10.allocate(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
    ... 4 more
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Response is null.; Host Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
    at org.apache.hadoop.ipc.Client.call(Client.java:1240)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    ... 6 more
Caused by: java.io.IOException: Response is null.
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)
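One detail worth noting in the trace above: the destination host "0.0.0.0":8030 is the default ResourceManager scheduler address, which suggests the AM may not be picking up yarn.resourcemanager.scheduler.address from its configuration. A hedged sketch of the relevant yarn-site.xml entry for a pseudo-distributed setup (the hostname here is an assumption):

```xml
<!-- yarn-site.xml: the address the AM uses to reach the RM scheduler.
     "localhost" is an assumption for a single-node, pseudo-distributed setup. -->
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>localhost:8030</value>
</property>
```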
Container_Exception

Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866" java.lang.NoSuchMethodError: com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
    at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
    at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
    at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
    at java.lang.Thread.run(Thread.java:662)
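The NoSuchMethodError above is the characteristic signature of a Guava version conflict: HDFS's SocketCache was compiled against the Guava version Hadoop ships, while a different Guava (14.0-rc1 here) appears first on the runtime classpath, so the expected values() signature is missing at run time. A hedged sketch of how one might hunt for duplicate Guava jars (paths are assumptions for a typical install):

```shell
# List every Guava jar visible to Hadoop; more than one version is trouble.
find "$HADOOP_HOME" -name 'guava-*.jar'

# Check which Guava your own application pulls in (Maven projects).
mvn dependency:tree -Dincludes=com.google.guava
```

If two versions show up, aligning the application's Guava with the one bundled by the Hadoop snapshot (rather than 14.0-rc1) would be the first thing to try.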

As I said, this problem does not occur for the 1 MB input. For the 6.8 MB input, nothing changes except the input dataset. Now a little about what I am doing, to give you the context of the problem. My AM starts N containers (4 in debug) and each container reads its part of the input data. When this process finishes, the containers exchange parts of the input (exchanging IDs of input structures, to enable communication between data structures). These exceptions occur during the ID exchange. I start a Netty server/client on each container and use ports 12000-12099 to communicate these IDs.
Any help will be greatly appreciated. Sorry for any typos, and if the explanation is not clear, just ask for whatever details you are interested in. It is currently after 2 AM; I hope that is a valid excuse.
regards
tmp