MapReduce, mail # dev - killApplication doesn't kill AppMaster


Re: killApplication doesn't kill AppMaster
Vinod Kumar Vavilapalli 2012-08-30, 00:23

Please attach your jstack dump; maybe I can spot something.

Here is the pointer you asked for: ContainerManagerImpl.stopContainer() -> ContainerImpl.KillTransition -> ContainersLauncher -> ContainerLaunch.cleanupContainer(). Follow the events carefully.

HTH,
+Vinod

On Aug 29, 2012, at 3:28 PM, Bo Wang wrote:

> Hi Vinod,
>
> Thanks for the suggestion. I was involved with some other issues before
> getting back to this one. Sorry for replying late.
>
> I tried to kill the process with "kill -3", but it was not interrupted. Then
> I used "kill -9", which sent a SIGKILL, and the process was killed. I checked
> stderr and used jstack to dump the stack trace. Everything looks
> normal. In fact, I simplified my test AM to just an empty while loop.
>
> I looked into the code to find where the SIGKILL is sent in YARN but didn't
> find it. I traced down to NodeManager.stopContainer, but didn't see it there.
> Would you mind sending me a pointer to the actual code?
>
> Thanks,
> Bo
>
> On Wed, Aug 22, 2012 at 7:29 PM, Vinod Kumar Vavilapalli <
> [EMAIL PROTECTED]> wrote:
>
>>
>>> I am not sure when to grab the stack trace of the AM. In the stdout/stderr
>>> of AM, no stack trace (or exception) is emitted.
>>
>>
>> You can login to the node and if the process is still alive, you can do a
>> "kill -3" which will dump the threads' status to stderr.
>>
>>
>>> Btw, I am curious how NM kills a container. Does it directly kill the JVM
>>> process?
>>
>>
>> NM directly kills the JVM with a SIGTERM followed by a SIGKILL.
>>
>> BTW, please also check the corresponding NM's logs if there is some
>> exception/error which could mean a bug in NM code.
>>
>> HTH,
>> +Vinod
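
For illustration only, the "SIGTERM followed by a SIGKILL" sequence mentioned above can be sketched with plain java.lang.Process calls. This is not the NodeManager code: the class GracefulKill, its methods spawn and terminate, and the grace period are hypothetical names chosen for this sketch. It also assumes a Unix-like system where a "sleep" command exists and where, as on typical OpenJDK builds, Process.destroy() delivers SIGTERM and destroyForcibly() delivers SIGKILL.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not YARN code: ask a child process to exit,
// then force-kill it if it is still alive after a grace period.
public class GracefulKill {

    // Start a child process; wraps IOException so callers need no checked handling.
    static Process spawn(String... cmd) {
        try {
            return new ProcessBuilder(cmd).start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the process exited within the grace period after the
    // polite request, false if the forcible kill had to be issued.
    static boolean terminate(Process p, long graceMillis) {
        p.destroy(); // polite request (assumed to be SIGTERM on Unix-like JVMs)
        try {
            if (p.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        p.destroyForcibly(); // forcible kill (assumed to be SIGKILL on Unix-like JVMs)
        try {
            p.waitFor(); // wait for the process to be gone
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        Process child = spawn("sleep", "60"); // assumes a Unix "sleep" binary
        boolean graceful = terminate(child, 2000);
        System.out.println("exited gracefully: " + graceful
                + ", still alive: " + child.isAlive());
    }
}
```

The grace period is what gives a well-behaved process a chance to run its shutdown logic before the forcible kill. A process that ignores or blocks the first signal is only stopped by the second step.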