MapReduce >> mail # user >> MapReduce Child don't exit?


Re: MapReduce Child don't exit?
Thanks for the reply, that's very helpful.

I think this is a bug in DFSClient.
2009/11/17 Jason Venner <[EMAIL PROTECTED]>

> The dfs client code waits until all of the datanodes that are going to
> hold a replica of your output's blocks have ack'd.
> If you are pausing there, most likely something is wrong in your hdfs
> cluster.
>
>
> On Thu, Nov 12, 2009 at 7:06 AM, Ted Xu <[EMAIL PROTECTED]> wrote:
>
>>  hi all,
>>
>> We are using hadoop-0.19.1 on about 200 nodes. We find that many slaves
>> keep a Child process running even after the job is done.
>>
>> Here is an example; the process has been running since August 9!
>>
>>
>>> 1000     24625     1  0 Aug09 ?        00:00:38 (...java... classpath)
>>> org.apache.hadoop.mapred.Child 127.0.0.1 55998
>>> attempt_200908081205_0054_r_000093_0 441920924
>>
>>
>> jstack output for the process is:
>>
>>
>>> 2009-11-12 14:58:59
>>> Full thread dump Java HotSpot(TM) Server VM (11.0-b15 mixed mode):
>>>
>>> "Attach Listener" daemon prio=10 tid=0x08168400 nid=0x457a waiting on
>>> condition [0x00000000..0x00000000]
>>>    java.lang.Thread.State: RUNNABLE
>>>
>>> "Thread-2" daemon prio=10 tid=0x08170400 nid=0x60f8 waiting for monitor
>>> entry [0xa33ad000..0xa33adfd0]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>         at
>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3085)
>>>         - waiting to lock <0xa84d12a8> (a
>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
>>>         at
>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3054)
>>>         at
>>> org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
>>>         - locked <0xa84cba48> (a
>>> org.apache.hadoop.hdfs.DFSClient$LeaseChecker)
>>>         at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:209)
>>>         - locked <0xa84cba60> (a org.apache.hadoop.hdfs.DFSClient)
>>>         at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:264)
>>>         at
>>> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413)
>>>         - locked <0xa84a1e00> (a org.apache.hadoop.fs.FileSystem$Cache)
>>>         at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236)
>>>         at
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221)
>>>         - locked <0xa84a26f0> (a
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>>>
>>> "SIGTERM handler" daemon prio=10 tid=0x08176800 nid=0x60f6 in
>>> Object.wait() [0xa35ad000..0xa35ae0d0]
>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>         at java.lang.Object.wait(Native Method)
>>>         - waiting on <0xa84a26f0> (a
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>>>         at java.lang.Thread.join(Thread.java:1143)
>>>         - locked <0xa84a26f0> (a
>>> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>>>         at java.lang.Thread.join(Thread.java:1196)
>>>         at
>>> java.lang.ApplicationShutdownHooks.run(ApplicationShutdownHooks.java:79)
>>>         at java.lang.Shutdown.runHooks(Shutdown.java:89)
>>>         at java.lang.Shutdown.sequence(Shutdown.java:133)
>>>         at java.lang.Shutdown.exit(Shutdown.java:178)
>>>         - locked <0xa4556020> (a java.lang.Class for java.lang.Shutdown)
>>>         at java.lang.Terminator$1.handle(Terminator.java:35)
>>>         at sun.misc.Signal$1.run(Signal.java:195)
>>>         at java.lang.Thread.run(Thread.java:619)
>>>
>>> "Comm thread for attempt_200908081205_0054_r_000093_0" daemon prio=10
>>> tid=0x083f0000 nid=0x6049 waiting for monitor entry [0xa35fe000..0xa35ff050]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>         at java.lang.Shutdown.exit(Shutdown.java:178)
>>>         - waiting to lock <0xa4556020> (a java.lang.Class for
>>> java.lang.Shutdown)
>>>         at java.lang.Runtime.exit(Runtime.java:90)
>>>         at java.lang.System.exit(System.java:906)
>>>         at org.apache.hadoop.mapred.Task$1.run(Task.java:430)
Best Regards,

Ted Xu
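
Reading the quoted thread dump: the "Comm thread" called System.exit() and is blocked on the java.lang.Shutdown class lock. The "SIGTERM handler" holds that lock and is in Thread.join(), waiting for the FileSystem$ClientFinalizer shutdown hook. The hook ("Thread-2") is itself blocked, waiting to lock a DFSOutputStream. The quoted part of the dump does not show which thread holds that monitor. The sketch below is only an illustration of that blocking pattern using plain threads. The names streamLock, holder, hook and exiter are hypothetical stand-ins, not Hadoop code, and the demo releases the lock at the end so that it terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;

final class ShutdownHookHangDemo {
    // Hypothetical stand-in for the DFSOutputStream monitor seen in the dump.
    static final Object streamLock = new Object();

    // Polls until the thread reaches the wanted state or the timeout expires.
    static void awaitState(Thread t, Thread.State wanted, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (t.getState() != wanted && System.currentTimeMillis() < deadline) {
            Thread.sleep(10);
        }
    }

    // Returns {hook state, exiter state} observed while the lock is still held.
    static Thread.State[] runDemo() {
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Plays the unknown thread that owns the stream monitor.
        Thread holder = new Thread(() -> {
            synchronized (streamLock) {
                lockHeld.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");

        // Plays the ClientFinalizer hook: it needs the stream monitor to close.
        Thread hook = new Thread(() -> {
            synchronized (streamLock) {
                // the close would happen here
            }
        }, "hook");

        // Plays the SIGTERM handler: it joins the hook and cannot finish first.
        Thread exiter = new Thread(() -> {
            try {
                hook.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "exiter");

        try {
            holder.start();
            lockHeld.await();          // make sure the monitor is taken first
            hook.start();
            exiter.start();
            awaitState(hook, Thread.State.BLOCKED, 2000);
            awaitState(exiter, Thread.State.WAITING, 2000);
            Thread.State[] observed = { hook.getState(), exiter.getState() };

            release.countDown();       // unblock everything so the demo ends
            holder.join();
            hook.join();
            exiter.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Thread.State[] s = runDemo();
        System.out.println("hook=" + s[0] + " exiter=" + s[1]);
    }
}
```

In the demo the hook reports BLOCKED and the joining thread reports WAITING, which matches the states of "Thread-2" and "SIGTERM handler" in the dump. In a real JVM nothing releases the monitor on the hook's behalf, so the exit sequence cannot complete until whatever holds it lets go.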