Hadoop, mail # user - LineReader, Buffering for FileInputFormat


Saptarshi Guha 2009-08-09, 22:38
Hello,
I am using TextInputFormat and its associated LineReader. The
RecordReader for this class reads each key and value using LineReader.
My question is: does LineReader hit the disk every time it needs to read a
line?
I notice it uses DataInputStream; does that do some internal buffering?

I guess it would be a performance hit if LineReader read from disk every
time it needs to fetch a line,
so I'm guessing it reads a chunk and parses lines from the chunk, but I
didn't see that happening.
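
To make the question concrete, here is roughly the chunk-then-parse pattern I have in mind. This is only a minimal sketch under my own assumptions, not Hadoop's actual LineReader code: the class name ChunkedLineReader, the refills counter, and the buffer size passed in are all hypothetical and exist just for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of chunked line reading; NOT Hadoop's LineReader.
class ChunkedLineReader {
    private final InputStream in;
    private final byte[] buffer;   // one chunk of the underlying stream
    private int bufferLength = 0;  // number of valid bytes currently in buffer
    private int bufferPos = 0;     // index of the next unread byte in buffer
    int refills = 0;               // how many chunks were pulled from the stream

    ChunkedLineReader(InputStream in, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.in = in;
        this.buffer = new byte[bufferSize];
    }

    /** Returns the next line without its '\n', or null at end of stream. */
    String readLine() throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawBytes = false;
        while (true) {
            if (bufferPos >= bufferLength) {
                // Buffer exhausted: this is the only place the stream is touched.
                bufferLength = in.read(buffer);
                bufferPos = 0;
                if (bufferLength <= 0) {
                    bufferLength = 0;
                    // End of stream: return a final unterminated line, if any.
                    return sawBytes
                        ? new String(line.toByteArray(), StandardCharsets.UTF_8)
                        : null;
                }
                refills++;
            }
            // Scan the in-memory chunk for the end of the current line.
            int start = bufferPos;
            while (bufferPos < bufferLength && buffer[bufferPos] != '\n') {
                bufferPos++;
            }
            line.write(buffer, start, bufferPos - start);
            if (bufferPos > start) {
                sawBytes = true;
            }
            if (bufferPos < bufferLength) {
                bufferPos++; // skip the '\n' itself
                return new String(line.toByteArray(), StandardCharsets.UTF_8);
            }
            // Otherwise the line continues into the next chunk; loop and refill.
        }
    }
}
```

With a reader like this, the underlying stream is read once per chunk rather than once per line, so many short lines would be served from memory. Is this what LineReader does internally?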

I am using Hadoop 0.20.

Any comments would be appreciated.

Regards
Saptarshi