Jay Vyas 2012-12-06, 21:37
Not sure what you're talking about. RecordReaders, or for that matter,
any DFS InputStream, do not copy data to local disk before reading it.
Non-data-local reads are streamed over the network, just as data-local
reads are streamed from the local disk.
There is no such logic as the one you seek.
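
To illustrate the point, here is a minimal sketch in plain Java. It is not
Hadoop's actual RecordReader API; StreamingLineReader and readRecords are
hypothetical names made up for this example. The reader consumes whatever
InputStream it is handed and never checks where the bytes come from. In
Hadoop the stream would typically come from FileSystem.open(path), and the
DFS client behind it decides whether a block is read from a local or a
remote DataNode.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a record reader; NOT Hadoop's RecordReader class.
public class StreamingLineReader {

    // Reads newline-delimited records directly from the given stream.
    // Note what is absent: no locality check and no copy to local disk.
    public static List<String> readRecords(InputStream in) {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for a DFS stream here (assumption
        // made only so the example runs without a cluster).
        InputStream in = new ByteArrayInputStream(
                "k1\tv1\nk2\tv2\n".getBytes(StandardCharsets.UTF_8));
        for (String record : readRecords(in)) {
            System.out.println(record);
        }
    }
}
```

The same readRecords call works unchanged whether the stream is backed by
memory, a local file, or a network socket, which is why no
"existsLocally" branch is needed at this layer.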
On Fri, Dec 7, 2012 at 3:07 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:
> Hi guys:
> Where and how does a Hadoop record reader decide whether or not it needs to
> copy a file to local disk?
> Clearly, since the InputSplit (which has metadata about file inputs) is the
> input to the RecordReader, the RecordReader would have to implement some
> kind of smart decision making ... I'm looking for something like
> if (!file.existsLocally())
>     return new InputStream(file);
> I've looked here:
> but don't see anything.
> Jay Vyas
Jay Vyas 2012-12-07, 03:33