Re: FileNotFoundException when getting files from DistributedCache
Barak Yaish 2012-11-22, 20:46
Thanks for the quick response.
I wanted to use DistributedCache to localize the files of interest on all nodes. Which API should I use to read those files, regardless of which node runs the mapper?

On Thu, Nov 22, 2012 at 10:38 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> You pointed out that you use:
>
> FSDataInputStream fs = FileSystem.get( context.getConfiguration() ).open( path )
>
> Note that this (FileSystem.get) returns an HDFS FileSystem by
> default, and your path is a local one. You can either use the plain
> java.io.File APIs or use
> FileSystem.getLocal(context.getConfiguration()) [1] to get a local
> filesystem handle that looks in file:/// filesystems rather than
> hdfs:// paths.
>
> [1] http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getLocal(org.apache.hadoop.conf.Configuration)
>
> On Fri, Nov 23, 2012 at 2:04 AM, Barak Yaish <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I have a 2-node cluster (v1.04), master and slave. On the master, in Tool.run(),
> > we add two files to the DistributedCache using addCacheFile(). The files do
> > exist in HDFS. In Mapper.setup() we want to retrieve those files from
> > the cache using FSDataInputStream fs = FileSystem.get(
> > context.getConfiguration() ).open( path ).
> > The problem is that for one file
> > a FileNotFoundException is thrown, although the file exists on the slave
> > node:
> >
> > attempt_201211211227_0020_m_000000_2: java.io.FileNotFoundException: File does not exist:
> > /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
> >
> > ls -l on the slave:
> >
> > [hduser@slave ~]$ ll /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
> > -rwxr-xr-x 1 hduser hadoop 42701 Nov 22 10:18 /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
> > [hduser@slave ~]$
> >
> > My questions are:
> >
> > Shouldn't all files exist on all nodes?
> > What should be done to fix that?
> >
> > Thanks.
>
> --
> Harsh J
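
For reference, here is a rough sketch of the "plain java.io" route suggested in the reply, written against the JDK only so it can be run outside a cluster. The class and method names (LocalCacheFileReader, readLines, writeDemoFile) are made up for illustration and do not come from the thread. Inside Mapper.setup() the local path would presumably come from DistributedCache.getLocalCacheFiles(context.getConfiguration()), and the Hadoop-side alternative is FileSystem.getLocal(context.getConfiguration()).open(path); both appear only in comments here because they need the Hadoop jars.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reads a file that already sits on
// the task node's local disk, without going through the default (HDFS) FileSystem.
public class LocalCacheFileReader {

    // Reads all lines of a local file using plain JDK I/O.
    // In a mapper, localPath would be the string form of one of the paths returned by
    // DistributedCache.getLocalCacheFiles(context.getConfiguration()) (assumption:
    // Hadoop 1.x API, as used in the thread).
    // The Hadoop-based equivalent would be:
    //   FileSystem.getLocal(context.getConfiguration()).open(path)
    public static List<String> readLines(String localPath) {
        try {
            return Files.readAllLines(Paths.get(localPath), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo-only stand-in for a localized cache file: writes the given lines
    // to a temporary file and returns its absolute path.
    public static String writeDemoFile(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("distcache-demo", ".csv");
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            tmp.toFile().deleteOnExit();
            return tmp.toAbsolutePath().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String path = writeDemoFile(Arrays.asList("a,1", "b,2"));
        for (String line : readLines(path)) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is only that a path under mapred/local/... is a local path, so it has to be opened with a local API; opening it through FileSystem.get(conf) makes Hadoop look for that same path in HDFS, which matches the FileNotFoundException reported above.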