MapReduce, mail # user - ConnectionException in container, happens only sometimes


Re: ConnectionException in container, happens only sometimes
Andrei 2013-07-11, 08:26
Here are the logs of the RM and the 2 NMs:

RM (master-host): http://pastebin.com/q4qJP8Ld
NM where AM ran (slave-1-host): http://pastebin.com/vSsz7mjG
NM where slave container ran (slave-2-host): http://pastebin.com/NMFi6gRp

The only related error I've found in them is the following (from RM logs):

...
2013-07-11 07:46:06,225 ERROR org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AppAttemptId doesnt exist in cache appattempt_1373465780870_0005_000001
2013-07-11 07:46:06,227 WARN org.apache.hadoop.ipc.Server: IPC Server Responder, call org.apache.hadoop.yarn.api.AMRMProtocolPB.allocate from 10.128.40.184:47101: output error
2013-07-11 07:46:06,228 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8030 caught an exception
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:265)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:456)
        at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2140)
        at org.apache.hadoop.ipc.Server.access$2000(Server.java:108)
        at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:939)
        at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1005)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1747)
2013-07-11 07:46:11,238 INFO org.apache.hadoop.yarn.util.RackResolver: Resolved my_user to /default-rack
2013-07-11 07:46:11,283 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: NodeManager from node my_user(cmPort: 59267 httpPort: 8042) registered with capability: 8192, assigned nodeId my_user:59267
...

Though from the stack trace alone it's hard to tell where this error came from.
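
For anyone who wants to narrow down the same kind of logs, here is a minimal sketch of how entries like the ones above could be filtered by level and by application attempt ID. It assumes the log4j-style "date time,millis LEVEL message" layout seen in the excerpt. The function names `parse_entries` and `filter_entries` are hypothetical helpers for illustration, not part of Hadoop.

```python
import re

# Start of a log entry as it appears in the excerpt above, e.g.
# "2013-07-11 07:46:06,225 ERROR org.apache...". Assumed layout, not a Hadoop API.
ENTRY_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s*(.*)$"
)


def parse_entries(lines):
    """Group raw log lines into (timestamp, level, message) tuples.

    Lines that do not start with a timestamp (stack frames, wrapped text)
    are attached to the entry they follow.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            timestamp, level, rest = match.groups()
            entries.append((timestamp, level, [rest] if rest else []))
        elif entries and line.strip():
            entries[-1][2].append(line.strip())
    return [(ts, level, "\n".join(parts)) for ts, level, parts in entries]


def filter_entries(entries, levels=("ERROR", "WARN"), needle=None):
    """Keep entries at the given levels, optionally containing `needle`."""
    return [
        entry
        for entry in entries
        if entry[1] in levels and (needle is None or needle in entry[2])
    ]
```

With a locally saved copy of the RM log (the file name `rm.log` is only an example), something like `filter_entries(parse_entries(open("rm.log")), needle="appattempt_1373465780870_0005_000001")` would list the ERROR and WARN entries that mention that attempt.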

Let me know if you need any more information.


On Thu, Jul 11, 2013 at 1:00 AM, Andrei <[EMAIL PROTECTED]> wrote:

> Hi Omkar,
>
> I'm out of the office now, so I'll post them as soon as I get back there.
>
> Thanks
>
>
> On Thu, Jul 11, 2013 at 12:39 AM, Omkar Joshi <[EMAIL PROTECTED]> wrote:
>
>> Can you post the RM/NM logs too?
>>
>> Thanks,
>> Omkar Joshi
>> *Hortonworks Inc.* <http://www.hortonworks.com>
>>
>>