
what to make of java.io.IOException: Premeture EOF from inputStream?

In my map-reduce job, I see the following stack trace in the syslog logs of my
map tasks. It repeats at roughly 10-minute intervals about 4-5 times, and
eventually the map task completes successfully.
I am not sure what to make of this stack trace. Are there repeated
retries that eventually succeed? If so, does that *necessarily*
imply there are corrupted blocks in DFS?

2009-07-27 12:28:30,593 WARN org.apache.hadoop.dfs.DFSClient:
Exception while reading from blk_4407619471727385075_668831 of
/data/part-00050 from XXX.XXX.XXX.231:50210: java.io.IOException:
Premeture EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
        at org.apache.hadoop.dfs.DFSClient$BlockReader.readChunk(DFSClient.java:996)
        at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:236)
        at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:191)
        at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
        at org.apache.hadoop.dfs.DFSClient$BlockReader.read(DFSClient.java:858)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1384)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1420)
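For context, the message at the top of the trace comes from the readFully frame: the block reader asked for a fixed number of bytes, and the underlying stream hit end-of-stream before delivering them all. The following is only a minimal, hypothetical re-implementation of that readFully pattern to illustrate the failure mode, not Hadoop's actual code:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PrematureEofDemo {
    // Sketch of the readFully pattern: loop until the buffer is full,
    // and fail if the stream ends first. InputStream.read returns -1
    // at end-of-stream, which is the condition behind the
    // "Premeture EOF from inputStream" message in the trace.
    static void readFully(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        while (len > 0) {
            int n = in.read(buf, off, len);
            if (n < 0) {
                throw new IOException("Premature EOF from inputStream");
            }
            off += n;
            len -= n;
        }
    }

    public static void main(String[] args) {
        byte[] buf = new byte[16];
        // Only 8 bytes available, but 16 requested -> premature EOF.
        InputStream in = new ByteArrayInputStream(new byte[8]);
        try {
            readFully(in, buf, 0, buf.length);
            System.out.println("read completed");
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

So the exception by itself only says a read from that datanode ended early; whether the block replica is actually corrupt is a separate question.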

Another question, unrelated to this: I see a few map tasks that are
shown as 100% complete but whose status is still "Running" after 20
minutes. Doesn't 100% complete *necessarily* mean that the status should
change to "Complete" within a minute or two?