HDFS >> mail # dev >> FSDataInputStream.read returns -1 with growing file and never continues reading


Christoph Rupp 2012-12-20, 12:21
Re: FSDataInputStream.read returns -1 with growing file and never continues reading
Hi Christoph,

If you use sync/hflush/hsync, the new length of the data is only seen by a
new reader, not by an existing reader. The "workaround" you've described is
exactly how we've implemented the "fs -tail <file>" utility. See the code
for that at http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Tail.java?view=markup
(note the loop at around line 74).
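
To make the reopen-and-seek pattern concrete, here is a minimal sketch. It is
not the Tail.java code and it does not talk to HDFS: it uses a local file and
java.io.RandomAccessFile as a stand-in so that it runs without a cluster. The
class name ReopenTail and the helper drainFrom are hypothetical. Against HDFS
you would presumably open the stream with FileSystem.open(path) and call
FSDataInputStream.seek(offset) at the marked step instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; a local file stands in for an HDFS file.
class ReopenTail {

    /**
     * Opens a fresh reader, seeks to the last known offset, copies everything
     * that is currently visible into sink, and returns the new offset.
     */
    static long drainFrom(File file, long offset, OutputStream sink) throws IOException {
        // A new reader per call: with HDFS this is the step that picks up
        // the length published by the writer's latest sync/hflush/hsync.
        try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
            in.seek(offset); // stand-in for FSDataInputStream.seek(offset)
            byte[] buf = new byte[4096];
            int n;
            // read() returns -1 at the end of the data this reader can see.
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
                offset += n;
            }
        }
        return offset; // remember this and pass it to the next call
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("growing", ".dat");
        file.deleteOnExit();
        ByteArrayOutputStream seen = new ByteArrayOutputStream();
        long pos = 0;
        try (FileOutputStream writer = new FileOutputStream(file, true)) {
            writer.write("first ".getBytes(StandardCharsets.UTF_8));
            writer.flush();
            pos = drainFrom(file, pos, seen); // sees the first chunk
            writer.write("second".getBytes(StandardCharsets.UTF_8));
            writer.flush();
            pos = drainFrom(file, pos, seen); // reopened, sees only the new chunk
        }
        System.out.println(pos + " bytes: " + new String(seen.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

A reader that follows a growing file would call drainFrom in a loop, sleeping
between calls and carrying the returned offset forward, rather than retrying
read() on a stream that has already returned -1.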

On Thu, Dec 20, 2012 at 5:51 PM, Christoph Rupp <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am experiencing an unexpected situation where FSDataInputStream.read()
> returns -1 while reading data from a file that another process still appends
> to. According to the documentation read() should never return -1 but throw
> Exceptions on errors. In addition, there's more data available, and read()
> definitely should not fail.
>
> The problem gets worse because the FSDataInputStream is not able to recover
> from this. If it once returns -1 then it will always return -1, even if the
> file continues growing.
>
> If, at the same time, other Java processes read other HDFS files, they will
> also return -1 immediately after opening the file. It smells like this error
> gets propagated to other client processes as well.
>
> I found a workaround: close the FSDataInputStream, open it again and then
> seek to the previous position. And then reading works fine.
>
> Another problem that I have seen is that the FSDataInputStream returns -1
> when reaching EOF. It will never return 0 (which I would expect when
> reaching EOF).
>
> I use CDH 4.1.2, but also saw this with CDH 3u5. I have attached samples to
> reproduce this.
>
> My cluster consists of 4 machines; 1 namenode and 3 datanodes. I run my
> tests on the namenode machine. There are no other HDFS users, and the load
> that is generated by my tests is fairly low, I would say.
>
> One process writes to 6 files simultaneously, but with a 5-second sleep between
> each write. It uses an FSDataOutputStream, and after writing data it calls
> sync(). Each write() appends 8 MB; it stops when the file grows to 100 MB.
>
> Six processes read files; each process reads one file. At first each reader
> loops until the file exists. Once it does, it opens the FSDataInputStream
> and starts reading. Usually the first process returns the first 8 MB in the
> file before it starts returning -1. But the other processes immediately
> return -1 without reading any data. I start the 6 reader processes before I
> start the writer.
>
> Search HdfsReader.java for "WORKAROUND" and remove the comments; this will
> reopen the FSDataInputStream after -1 is returned, and then everything
> works.
>
> Sources are attached.
>
> This is a very basic scenario and I wonder if I'm doing anything wrong or if
> I found an HDFS bug.
>
> bye
> Christoph
>

--
Harsh J
Christoph Rupp 2012-12-20, 18:32
Colin McCabe 2012-12-27, 20:37