MapReduce >> mail # user >> MapReduce Child don't exit?


Ted Xu 2009-11-12, 07:06
Jason Venner 2009-11-17, 06:30
Re: MapReduce Child don't exit?
Thanks for the reply, that's very helpful.

I think it is a bug in DFSClient.
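
To make the pattern in the quoted thread dump easier to see, here is a minimal sketch in plain Java. It is not Hadoop code. It only shows that a hook-like thread which needs a monitor held by another thread sits in BLOCKED state, which is what "Thread-2" reports inside DFSOutputStream.closeInternal, and anything that joins that thread (as the "SIGTERM handler" does while running the shutdown hooks) waits along with it. The class name, the "writer" thread, and the streamLock object are all hypothetical; the quoted dump does not show which thread actually holds the DFSOutputStream monitor. The hook is run as an ordinary thread rather than registered with Runtime.addShutdownHook, and the lock is released at the end, so the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;

public class ShutdownHookBlockSketch {

    /**
     * Starts a hypothetical "writer" that holds a monitor, then a hook-like
     * thread that needs the same monitor, and returns the hook's observed state.
     */
    static Thread.State observeHookState() throws InterruptedException {
        final Object streamLock = new Object();               // stand-in for the output stream's monitor (assumption)
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Hypothetical writer: keeps the monitor while it waits, e.g. for acks that never arrive.
        Thread writer = new Thread(() -> {
            synchronized (streamLock) {
                lockHeld.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "writer");
        writer.setDaemon(true);
        writer.start();
        lockHeld.await();                                      // the monitor is now definitely held

        // Hook-like thread: its close path needs the same monitor, so it cannot proceed.
        Thread hook = new Thread(() -> {
            synchronized (streamLock) {
                // a real hook would close the stream here
            }
        }, "hook");
        hook.setDaemon(true);
        hook.start();

        // Poll for up to about 5 seconds until the hook is parked on the monitor.
        Thread.State state = hook.getState();
        for (int i = 0; i < 500 && state != Thread.State.BLOCKED; i++) {
            Thread.sleep(10);
            state = hook.getState();
        }

        // Unlike the stuck Child process, release the lock so the demo can finish.
        release.countDown();
        hook.join(5000);
        writer.join(5000);
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("hook state while lock is held: " + observeHookState());
    }
}
```

If the monitor were never released, a System.exit() call would wait on the hook indefinitely, which matches the "Comm thread" being stuck behind the Shutdown class lock in the dump.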
2009/11/17 Jason Venner <[EMAIL PROTECTED]>

> The DFS client code waits until all of the datanodes that are going to
> hold a replica of your output's blocks have ack'd.
> If you are pausing there, most likely something is wrong in your HDFS
> cluster.
>
>
> On Thu, Nov 12, 2009 at 7:06 AM, Ted Xu <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> We are using hadoop-0.19.1 on about 200 nodes. We find that lots of
>> slaves keep Child processes running even after the job is done.
>>
>> Here is an example; the process has been running since August 9!
>>
>>
>>> 1000     24625     1  0 Aug09 ?        00:00:38 (...java... classpath)
>>> org.apache.hadoop.mapred.Child 127.0.0.1 55998
>>> attempt_200908081205_0054_r_000093_0 441920924
>>
>>
>> jstack output for the process is:
>>
>>
>>> 2009-11-12 14:58:59
>>> Full thread dump Java HotSpot(TM) Server VM (11.0-b15 mixed mode):
>>>
>>> "Attach Listener" daemon prio=10 tid=0x08168400 nid=0x457a waiting on
>>> condition [0x00000000..0x00000000]
>>>    java.lang.Thread.State: RUNNABLE
>>>
>>> "Thread-2" daemon prio=10 tid=0x08170400 nid=0x60f8 waiting for monitor
>>> entry [0xa33ad000..0xa33adfd0]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>         at
>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3085)
>>>         - waiting to lock <0xa84d12a8> (a
>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
>>>         at
>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3054)
>>>         at
>>> org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
>>>         - locked <0xa84cba48> (a
>>> org.apache.hadoop.hdfs.DFSClient$LeaseChecker)
>>>         at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:209)
>>>         - locked <0xa84cba60> (a org.apache.hadoop.hdfs.DFSClient)
>>>         at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:264)
>>>         at
>>> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413)
>>>         - locked <0xa84a1e00> (a org.apache.hadoop.fs.FileSystem$Cache)
>>>         at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236)
>>>         at
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221)
>>>         - locked <0xa84a26f0> (a
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>>>
>>> "SIGTERM handler" daemon prio=10 tid=0x08176800 nid=0x60f6 in
>>> Object.wait() [0xa35ad000..0xa35ae0d0]
>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>         at java.lang.Object.wait(Native Method)
>>>         - waiting on <0xa84a26f0> (a
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>>>         at java.lang.Thread.join(Thread.java:1143)
>>>         - locked <0xa84a26f0> (a
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>>>         at java.lang.Thread.join(Thread.java:1196)
>>>         at
>>> java.lang.ApplicationShutdownHooks.run(ApplicationShutdownHooks.java:79)
>>>         at java.lang.Shutdown.runHooks(Shutdown.java:89)
>>>         at java.lang.Shutdown.sequence(Shutdown.java:133)
>>>         at java.lang.Shutdown.exit(Shutdown.java:178)
>>>         - locked <0xa4556020> (a java.lang.Class for java.lang.Shutdown)
>>>         at java.lang.Terminator$1.handle(Terminator.java:35)
>>>         at sun.misc.Signal$1.run(Signal.java:195)
>>>         at java.lang.Thread.run(Thread.java:619)
>>>
>>> "Comm thread for attempt_200908081205_0054_r_000093_0" daemon prio=10
>>> tid=0x083f0000 nid=0x6049 waiting for monitor entry [0xa35fe000..0xa35ff050]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>         at java.lang.Shutdown.exit(Shutdown.java:178)
>>>         - waiting to lock <0xa4556020> (a java.lang.Class for
>>> java.lang.Shutdown)
>>>         at java.lang.Runtime.exit(Runtime.java:90)
>>>         at java.lang.System.exit(System.java:906)
>>>         at org.apache.hadoop.mapred.Task$1.run(Task.java:430)
Best Regards,

Ted Xu