Hadoop, mail # user - Re: MapReduce job failing when a node of cluster is rebooted


Re: MapReduce job failing when a node of cluster is rebooted
alo alt 2011-12-27, 13:55
Did the DN you just rebooted reconnect to the NN? Most likely the
datanode daemon isn't running. Check it on that node:
ps waux |grep "DataNode" |grep -v "grep"
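
The same check can be wrapped in a small function so it is easy to reuse in a
startup or monitoring script. This is only a sketch: `daemon_running` is a
made-up helper name, not something Hadoop ships, and it assumes `pgrep` is
installed on the node.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): exits 0 if at least one running
# process has a command line matching the given pattern, non-zero otherwise.
daemon_running() {
    # pgrep -f matches against the full command line and never reports itself,
    # so no extra "grep -v grep" step is needed.
    pgrep -f -- "$1" >/dev/null 2>&1
}

# "DataNode" is the same pattern used in the ps/grep check.
if daemon_running "DataNode"; then
    echo "DataNode process found"
else
    echo "No DataNode process found; the daemon may need to be started"
fi
```

A running process alone does not prove the node has registered again. Running
`hadoop dfsadmin -report` against the cluster lists the datanodes the NameNode
currently knows about, which covers the other half of the question.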

- Alex

On Tue, Dec 27, 2011 at 2:44 PM, Rajat Goel <[EMAIL PROTECTED]> wrote:
> Yes. Hdfs and Mapred related dirs are set outside of /tmp.
>
> On Tue, Dec 27, 2011 at 6:48 PM, alo alt <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> did you set the hdfs-related dirs outside of /tmp? Most *nix systems
>> clean /tmp on reboot.
>>
>> - Alex
>>
>> On Tue, Dec 27, 2011 at 2:09 PM, Rajat Goel <[EMAIL PROTECTED]> wrote:
>> > Hi,
>> >
>> > I have a 7-node setup (1 - Namenode/JobTracker, 6 - Datanodes/TaskTrackers)
>> > running Hadoop version 0.20.203.
>> >
>> > I performed the following test:
>> > Initially the cluster is running smoothly. Just before launching a MapReduce
>> > job (about one or two minutes before), I shut down one of the data nodes
>> > (rebooted the machine). Then my MapReduce job starts but immediately fails
>> > with the following messages on stderr:
>> >
>> > WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
>> > use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
>> > files.
>> > WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
>> > use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
>> > files.
>> > WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
>> > use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
>> > files.
>> > WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
>> > use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
>> > files.
>> > NOTICE: Configuration: /device.map    /region.map    /url.map
>> > /data/output/2011/12/26/08
>> >  PS:192.168.100.206:11111    3600    true    Notice
>> > 11/12/26 09:10:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> > 11/12/26 09:10:26 INFO input.FileInputFormat: Total input paths to process : 24
>> > 11/12/26 09:10:37 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
>> > 11/12/26 09:10:37 INFO hdfs.DFSClient: Abandoning block blk_-6309642664478517067_35619
>> > 11/12/26 09:10:37 INFO hdfs.DFSClient: Waiting to find target node: 192.168.100.7:50010
>> > 11/12/26 09:10:44 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.NoRouteToHostException: No route to host
>> > 11/12/26 09:10:44 INFO hdfs.DFSClient: Abandoning block blk_4129088682008611797_35619
>> > 11/12/26 09:10:53 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
>> > 11/12/26 09:10:53 INFO hdfs.DFSClient: Abandoning block blk_3596375242483863157_35619
>> > 11/12/26 09:11:01 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
>> > 11/12/26 09:11:01 INFO hdfs.DFSClient: Abandoning block blk_724369205729364853_35619
>> > 11/12/26 09:11:07 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3002)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
>> >
>> > 11/12/26 09:11:07 WARN hdfs.DFSClient: Error Recovery for block blk_724369205729364853_35619 bad datanode[1] nodes == null
>> > 11/12/26 09:11:07 WARN hdfs.DFSClient: Could not get block locations. Source file "/data/hadoop-admin/mapred/staging/admin/.staging/job_201112200923_0292/job.split"
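
The `firstBadLink` lines in the quoted log name the datanode that the write
pipeline could not reach. A rough way to see which address keeps failing is to
count those lines in the saved job client output. The sketch below assumes the
log has one message per line (not wrapped by a mail client) and that `grep -o`
is available; `count_bad_links` is a made-up helper name. An address that
dominates the count is only a likely candidate for the rebooted node, since
the thread does not state which machine was restarted.

```shell
#!/bin/sh
# Hypothetical helper: reads job client output on stdin and prints how often
# each datanode address appears as firstBadLink, most frequent first.
count_bad_links() {
    grep -o 'firstBadLink as [0-9.]*:[0-9]*' |
        awk '{ print $3 }' |      # keep only the host:port field
        sort | uniq -c | sort -rn
}

# Sample input: the three firstBadLink messages from the quoted log,
# each written on a single line.
count_bad_links <<'EOF'
11/12/26 09:10:37 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
11/12/26 09:10:53 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
11/12/26 09:11:01 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
EOF
```

For the sample above this reports a count of 3 for 192.168.100.5:50010. On a
real log file the function would be fed with something like
`count_bad_links < job.stderr`, where `job.stderr` is a placeholder file name.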

Alexander Lorenz
http://mapredit.blogspot.com
