Re: Yarn -- one of the daemons getting killed
Hi Jeff,

  I have run the resource manager in the foreground without nohup, and here
are the messages from when it was killed. It says "Killed" but doesn't say

13/12/17 03:14:54 INFO capacity.CapacityScheduler: Application
appattempt_1387266015651_0258_000001 released container
container_1387266015651_0258_01_000003 on node: host: isredeng:36576
#containers=2 available=7936 used=256 with event: FINISHED
13/12/17 03:14:54 INFO rmcontainer.RMContainerImpl:
container_1387266015651_0258_01_000005 Container Transitioned from ACQUIRED
On Mon, Dec 16, 2013 at 11:10 PM, Jeff Stuckman <[EMAIL PROTECTED]> wrote:

>  What if you open the daemons in a "screen" session rather than running
> them in the background -- for example, run "yarn resourcemanager". Then you
> can see exactly when they terminate, and hopefully why.
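One way to capture how a foreground daemon ended, rather than watching the terminal, is to launch it from a small wrapper and inspect its return code. The sketch below is only an illustration: the helper names `describe_exit` and `run_and_report` are hypothetical, and it assumes a POSIX system, where Python's `subprocess` reports death by signal as a negative return code. A shell that prints "Killed" is reporting SIGKILL, which is also the signal the Linux OOM killer sends, although SIGKILL can come from other sources too.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable reason."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {name} ({-returncode})"
    if returncode > 128:
        # Shell convention: status 128 + N when a child died from signal N.
        return f"exit status {returncode} (shell convention for signal {returncode - 128})"
    return f"exited normally with status {returncode}"


def run_and_report(cmd):
    """Run cmd in the foreground, wait for it, and describe how it ended."""
    proc = subprocess.run(cmd)
    return describe_exit(proc.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python watch_exit.py yarn resourcemanager
    print(run_and_report(sys.argv[1:]))
```

If the report names SIGKILL, the process did not shut itself down, which would explain why its own log files show nothing about the exit.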
> From: Krishna Kishore Bonagiri
> Sent: Monday, December 16, 2013 6:20 AM
> Subject: Re: Yarn -- one of the daemons getting killed
>  Hi Vinod,
>   Yes, I am running on Linux.
>  I was actually searching for a corresponding message in /var/log/messages
> to confirm that the OOM killer killed my daemons, but could not find any
> such message there! According to the following link, if it is a memory
> issue, I should see a message even if the OOM killer is disabled, but I
> don't see one.
>  http://www.redhat.com/archives/taroon-list/2007-August/msg00006.html
>    And, is memory consumption higher on a two-node cluster than on a
> single-node one? Also, I see this problem only when I give "*" as the node
> name.
>    One other thing I suspected was the allowed number of user processes.
> I increased that from 1024 to 31000, but that didn't help either.
>  Thanks,
> Kishore
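To confirm that a raised process limit is actually in effect, the limit can be read back from inside a running process. This is a minimal sketch using Python's standard `resource` module; the helper names are hypothetical. Resource limits are inherited from the parent process, so the values shown are those of the shell that launches the script, and a daemon started from a different session may see different ones.

```python
import resource


def format_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nproc_limits():
    """Return the (soft, hard) limit on the number of processes for this user."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print(f"max user processes: soft={format_limit(soft)} hard={format_limit(hard)}")
```

Running it from the same shell that starts the YARN daemons shows whether the change to 31000 reached that session.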
> On Fri, Dec 13, 2013 at 11:51 PM, Vinod Kumar Vavilapalli <
>> Yes, that is what I suspect. That is why I asked if everything is on a
>> single node. If you are running Linux, the Linux OOM killer may be shooting
>> things down. When it happens, you will see something like "Killed process"
>> in the system's syslog.
>>    Thanks,
>> +Vinod
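The syslog check described above can be sketched as a small scan for "Killed process" lines. This is an illustration under stated assumptions: the regular expression expects "Killed process <pid>" followed by the process name in parentheses, and the exact layout of the kernel message varies between kernel versions. The function name `find_oom_kills` and the sample lines are hypothetical, made up for the example.

```python
import re

# Assumption: the kernel line contains "Killed process <pid>" followed by the
# process name in parentheses; the exact layout differs between kernel versions.
OOM_LINE = re.compile(r"killed process (\d+)\b[^(]*\(([^)]+)\)", re.IGNORECASE)


def find_oom_kills(lines):
    """Return (pid, process_name) for every line that looks like an OOM kill."""
    hits = []
    for line in lines:
        match = OOM_LINE.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2)))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines, invented for illustration only.
    sample = [
        "Dec 17 03:14:55 host kernel: Out of memory: Kill process 4242 (java) score 812 or sacrifice child",
        "Dec 17 03:14:55 host kernel: Killed process 4242 (java) total-vm:4096000kB, anon-rss:2048000kB",
        "Dec 17 03:15:01 host crond[100]: (root) CMD (run-parts /etc/cron.hourly)",
    ]
    print(find_oom_kills(sample))
```

To scan a real log, pass an open file object, for example `find_oom_kills(open("/var/log/messages"))`. Reading that file usually requires root, and some distributions write kernel messages to a different file.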
>>  On Dec 13, 2013, at 4:52 AM, Krishna Kishore Bonagiri <
>> [EMAIL PROTECTED]> wrote:
>>  Vinod,
>>   One more thing I observed is that my client, which continuously submits
>> Application Masters one after another, also gets killed sometimes. So it is
>> always one of the Java processes that is getting killed. Does that indicate
>> excessive memory usage by them, or something similar, that is causing them
>> to die? If so, how can we resolve this kind of issue?
>>  Thanks,
>> Kishore
>> On Fri, Dec 13, 2013 at 10:16 AM, Krishna Kishore Bonagiri <
>> [EMAIL PROTECTED]> wrote:
>>> No, I am running on a 2-node cluster.
>>> On Fri, Dec 13, 2013 at 1:52 AM, Vinod Kumar Vavilapalli <
>>> [EMAIL PROTECTED]> wrote:
>>>> Is all of this on a single node?
>>>>   Thanks,
>>>> +Vinod
>>>>  On Dec 12, 2013, at 3:26 AM, Krishna Kishore Bonagiri <
>>>> [EMAIL PROTECTED]> wrote:
>>>>  Hi,
>>>>   I am running a small application on YARN (2.2.0) in a loop of 500
>>>> iterations, and while doing so one of the daemons (node manager, resource
>>>> manager, or data node) gets killed (I mean it disappears) at a random
>>>> point. I see no information in the corresponding log files. How can I find
>>>> out why this is happening?
>>>>   One more observation: this happens only when I am using "*" for the
>>>> node name in the container requests. When I use a specific node name,
>>>> everything is fine.
>>>>  Thanks,
>>>> Kishore