Jay Vyas 2012-12-06, 21:37
Re: DFS and the RecordReader

Not sure what you're talking about. RecordReaders, and for that matter
any DFS InputStream, do not pull data to local disk before reading it.
Non-data-local reads are streamed over the network, just as regular
data-local reads are streamed from a local disk.

There is no such logic as the one you seek.
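
To illustrate the point, here is a minimal sketch of a reader that only consumes whatever stream it is handed. It is not Hadoop's actual RecordReader; the class name StreamingLineReader and the method readRecords are hypothetical, and a ByteArrayInputStream stands in for the DFS stream. In a real job the stream would typically come from FileSystem.open(Path), and the reading code would look the same whether the block is local or remote.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hadoop's RecordReader implementation.
class StreamingLineReader {

    // Reads newline-delimited records directly from the given stream.
    // The reader never checks where the bytes live and never copies
    // the file to local disk first; it just consumes the stream.
    static List<String> readRecords(InputStream in) throws IOException {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: an in-memory stream stands in for a DFS input stream.
        InputStream in = new ByteArrayInputStream(
                "k1\tv1\nk2\tv2\n".getBytes(StandardCharsets.UTF_8));
        for (String record : readRecords(in)) {
            System.out.println(record);
        }
    }
}
```

The only thing the reader depends on is the InputStream contract, which is why there is no "copy to local disk" branch anywhere in it.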

On Fri, Dec 7, 2012 at 3:07 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:
> Hi guys:
> Where and how does a Hadoop record reader decide whether or not it needs to
> copy a file to local disk?
> Clearly, since the InputSplit (which has meta data about file inputs) is the
> input to the RecordReader, the RecordReader would have to implement some
> kind of smart decision making ... I'm looking for something like
> //Pseudocode
> if(! file.existsLocally())
>    copyFileToDisk(file.getPath());
> return new InputStream(file);
> I've looked here:
> http://grepcode.com/file/repo1.maven.org/maven2/org.jvnet.hudson.hadoop/hadoop-core/0.19.1-hudson-2/org/apache/hadoop/hdfs/DFSClient.java#DFSClient.create%28java.lang.String%2Corg.apache.hadoop.fs.permission.FsPermission%2Cboolean%2Cshort%2Clong%2Corg.apache.hadoop.util.Progressable%2Cint%29
> but don't see anything.
> --
> Jay Vyas
> http://jayunit100.blogspot.com

Harsh J
Jay Vyas 2012-12-07, 03:33