I am using TextInputFormat and its associated LineReader. The
RecordReader for this class
reads each key and value using LineReader.
My question is: does LineReader hit the disk every time it needs to read a
line?
I notice it uses DataInputStream; does that do some internal buffering?
I guess it would be a performance hit if LineReader read from disk every
time it needs to fetch a line,
so I'm guessing it reads a chunk and parses lines from that chunk, but I
didn't see that happening.
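
To make that guess concrete, here is a minimal sketch of the chunk-and-parse pattern I have in mind. ChunkedLineReader, its bufferSize parameter and fillCount() are made-up names for illustration only; this is not Hadoop's LineReader code, just what I would expect a buffered line reader to look like. It only treats '\n' as a line terminator.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, NOT Hadoop's LineReader.
class ChunkedLineReader {
    private final InputStream in;
    private final byte[] buffer;   // one chunk read from the underlying stream
    private int length = 0;        // number of valid bytes currently in buffer
    private int pos = 0;           // index of the next unread byte in buffer
    private int fills = 0;         // how many times the underlying stream was read

    ChunkedLineReader(InputStream in, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.in = in;
        this.buffer = new byte[bufferSize];
    }

    // Returns the next line without its trailing '\n', or null at end of stream.
    String readLine() {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawData = false;
        while (true) {
            if (pos >= length) {
                // Buffer exhausted: this is the only place the stream is touched.
                try {
                    length = in.read(buffer, 0, buffer.length);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                fills++;
                pos = 0;
                if (length <= 0) {
                    length = 0;
                    // End of stream: return a final unterminated line if there is one.
                    return sawData
                        ? new String(line.toByteArray(), StandardCharsets.UTF_8)
                        : null;
                }
            }
            sawData = true;
            // Scan the in-memory chunk for the next newline.
            int start = pos;
            while (pos < length && buffer[pos] != '\n') {
                pos++;
            }
            line.write(buffer, start, pos - start);
            if (pos < length) {
                pos++; // skip the '\n'
                return new String(line.toByteArray(), StandardCharsets.UTF_8);
            }
            // No newline in this chunk: loop to refill and keep appending.
        }
    }

    // Number of reads issued against the underlying stream so far.
    int fillCount() {
        return fills;
    }
}
```

With a pattern like this, several short lines are served from a single read of the underlying stream, and only a line that crosses the end of the chunk triggers another read.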
I am using Hadoop 0.20.
Any comments would be appreciated.