Jay Vyas 2012-12-06, 21:37
Re: DFS and the RecordReader
Harsh J 2012-12-06, 22:15
Hi,
Not sure what you're talking about. RecordReaders, or for that matter any DFS InputStream, do not pull data to local disk before reading it. Non-data-local reads are streamed over the network, just as regular data-local reads are streamed from a local disk. There is no such logic as the one you seek.

On Fri, Dec 7, 2012 at 3:07 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:
> Hi guys:
>
> Where and how does a Hadoop record reader decide whether or not it needs to
> copy a file to local disk?
>
> Clearly, since the InputSplit (which has metadata about file inputs) is the
> input to the RecordReader, the RecordReader would have to implement some
> kind of smart decision making ... I'm looking for something like
>
> // Pseudocode
> if (!file.existsLocally())
>     copyFileToDisk(file.getPath());
>
> return new InputStream(file);
>
> I've looked here:
>
> http://grepcode.com/file/repo1.maven.org/maven2/org.jvnet.hudson.hadoop/hadoop-core/0.19.1-hudson-2/org/apache/hadoop/hdfs/DFSClient.java#DFSClient.create%28java.lang.String%2Corg.apache.hadoop.fs.permission.FsPermission%2Cboolean%2Cshort%2Clong%2Corg.apache.hadoop.util.Progressable%2Cint%29
>
> but don't see anything.
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com

--
Harsh J
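
To make the point above concrete, here is a rough sketch in plain Java (no Hadoop classes) of what "streamed, not copied" looks like from the reader's side. Everything in it is a hypothetical stand-in: `StreamSource` plays the role of the file system client that hands back a stream, `readRecords` plays the role of a line-oriented record reader, and the byte arrays stand in for a local and a remote replica of the same block. It is not Hadoop's actual code, only an illustration of the shape of the logic.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class StreamingReadSketch {

    /** Hypothetical stand-in for a file system client: it only hands back a stream. */
    interface StreamSource {
        InputStream open() throws IOException;
    }

    /**
     * Hypothetical stand-in for a record reader. It consumes the stream as the
     * bytes arrive and never asks whether the data is local or remote, and it
     * never stages the data in a local file first.
     */
    static List<String> readRecords(StreamSource source) {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source.open(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line); // one record per line
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] block = "k1\tv1\nk2\tv2\n".getBytes(StandardCharsets.UTF_8);

        // Both "replicas" are in-memory stand-ins; in a real cluster one stream
        // would be backed by a local disk and the other by a network socket.
        StreamSource localReplica = () -> new ByteArrayInputStream(block);
        StreamSource remoteReplica = () -> new ByteArrayInputStream(block);

        // The reader code path is identical in both cases.
        System.out.println(readRecords(localReplica));
        System.out.println(readRecords(remoteReplica));
    }
}
```

The design point is that the local-versus-remote decision lives behind the stream, so the reader has no `existsLocally()` / `copyFileToDisk()` branch of the kind asked about in the quoted message.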
Jay Vyas 2012-12-07, 03:33