MapReduce >> mail # user >> Container allocation fails randomly


Re: Container allocation fails randomly
Can you give more information? Complete logs from around this time frame
would help a lot. Are the containers being assigned by the scheduler? Is it
failing when the node manager tries to start the container? I can see that
the diagnostic message is empty, but do you see anything else in the NM
logs? Also, if there were containers running on the machine before the new
ones were launched, were they killed, or are they still hanging around? Can
you also try applying the patch from
https://issues.apache.org/jira/browse/YARN-1053 and check whether you see
any message?

Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>
On Thu, Sep 12, 2013 at 6:15 AM, Krishna Kishore Bonagiri <
[EMAIL PROTECTED]> wrote:

> Hi,
>   I am using 2.1.0-beta and have seen container allocation failing
> randomly even when running the same application in a loop. I know that the
> cluster has enough resources to give, because it gave the resources for the
> same application all the other times in the loop and ran it successfully.
>
>    I have observed many messages like the following in the node manager's
> log whenever such a failure happens. Any clues as to why it happens?
>
> 2013-09-12 08:54:36,204 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
> 2013-09-12 08:54:37,220 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
> 2013-09-12 08:54:38,231 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
> 2013-09-12 08:54:39,239 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
> 2013-09-12 08:54:40,267 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
> 2013-09-12 08:54:41,275 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
> 2013-09-12 08:54:42,283 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
> 2013-09-12 08:54:43,289 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
> out status for container: container_id { app_attempt_id { application_id {
> id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
> C_RUNNING diagnostics: "" exit_status: -1000
>
>
> Thanks,
> Kishore
>
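
When collecting the NM logs requested above, it can help to condense the repeated status reports into one summary per container. The following is a minimal sketch, not part of the original thread: it assumes the raw node manager log text (without the "> " mail quoting) in the format quoted above, and the names `STATUS_RE`, `SAMPLE`, and `summarize_status_reports` are hypothetical. The container ID string is built using the usual Hadoop 2.x convention (`container_<clusterTimestamp>_<appId>_<attemptId>_<containerId>`), which is an assumption you should check against your own logs.

```python
import re
from collections import defaultdict

# Matches one "Sending out status for container" record. Whitespace is
# matched loosely so a record wrapped across several lines still matches.
STATUS_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+INFO\s+"
    r"\S*NodeStatusUpdaterImpl:\s+Sending\s+out\s+status\s+for\s+container:\s+"
    r"container_id\s*\{\s*app_attempt_id\s*\{\s*application_id\s*\{\s*"
    r"id:\s*(?P<app>\d+)\s+cluster_timestamp:\s*(?P<cluster_ts>\d+)\s*\}\s*"
    r"attemptId:\s*(?P<attempt>\d+)\s*\}\s*id:\s*(?P<container>\d+)\s*\}\s*"
    r"state:\s*(?P<state>\w+)\s+diagnostics:\s*\"(?P<diag>[^\"]*)\"\s+"
    r"exit_status:\s*(?P<exit>-?\d+)"
)

# Two records copied from the log excerpt quoted in this thread.
SAMPLE = '''
2013-09-12 08:54:36,204 INFO
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
out status for container: container_id { app_attempt_id { application_id {
id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:37,220 INFO
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
out status for container: container_id { app_attempt_id { application_id {
id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state:
C_RUNNING diagnostics: "" exit_status: -1000
'''


def summarize_status_reports(log_text):
    """Group status reports by container and collect what was reported."""
    summary = defaultdict(lambda: {
        "count": 0, "first": None, "last": None,
        "states": set(), "exit_statuses": set(), "diagnostics": set(),
    })
    for m in STATUS_RE.finditer(log_text):
        # Assumed Hadoop 2.x container ID layout; verify against your logs.
        cid = "container_{}_{:04d}_{:02d}_{:06d}".format(
            m.group("cluster_ts"), int(m.group("app")),
            int(m.group("attempt")), int(m.group("container")),
        )
        info = summary[cid]
        info["count"] += 1
        if info["first"] is None:
            info["first"] = m.group("ts")
        info["last"] = m.group("ts")
        info["states"].add(m.group("state"))
        info["exit_statuses"].add(int(m.group("exit")))
        info["diagnostics"].add(m.group("diag"))
    return dict(summary)


for cid, info in summarize_status_reports(SAMPLE).items():
    print(cid, info["count"], info["first"], info["last"],
          sorted(info["states"]), sorted(info["exit_statuses"]))
```

The resulting container ID can then be used to search the rest of the NM log for launch or cleanup messages about the same container, which is the kind of surrounding detail asked for in the reply.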
