Re: Symlinks for cacheArchives
1. Hmm, tgz files are unpacked, but the name doesn't change.
2. Append "#name" to the archive URI to have it symlinked there under that name.
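
A minimal sketch of point 2, using only java.net.URI so it runs without Hadoop on the classpath. The hdfs:// scheme, the link name "Rdist", and the helper withLinkName are assumptions made for illustration; the host and path are taken from the output quoted below. The resulting URI is what would be passed to DistributedCache.addCacheArchive, as noted in the comments.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CacheArchiveLink {

    /**
     * Hypothetical helper: returns the archive URI with "#linkName"
     * set as its fragment. DistributedCache reads the fragment as the
     * name of the symlink to create in the task's working directory.
     */
    public static URI withLinkName(String archiveUri, String linkName) {
        URI base = URI.create(archiveUri);
        try {
            // Keep scheme and scheme-specific part, replace the fragment.
            return new URI(base.getScheme(), base.getSchemeSpecificPart(), linkName);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed location of the archive; adjust to your cluster.
        URI uri = withLinkName("hdfs://a.X.com/user/sguha/Rdist.tar.gz", "Rdist");
        System.out.println(uri);

        // In the job setup (requires Hadoop, not compiled here):
        //   DistributedCache.addCacheArchive(uri, conf);
        //   DistributedCache.createSymlink(conf);
        // The task should then see ./Rdist pointing at the unpacked archive.
    }
}
```

If the archive is added without a fragment, there is no link name for the framework to use, which would match the missing symlink in the listing below.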
On Thu, Apr 28, 2011 at 10:23 PM, Saptarshi Guha
<[EMAIL PROTECTED]> wrote:
> Hello,
>
> From the docs (for 0.20) for DistributedCache [1], I'm under the
> impression that .tgz files will be unzipped, untarred, and symlinked
> into the job's current dir.
>
> However, when running the job, this little fragment [2] prints the
> following (I called DistributedCache.createSymlink(config_); just after
> adding the cache components):
>
> Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5775566659502863353_-129792898_530471609/a.X.com/user/sguha/tmp/rhipe-hbase.jar
> Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5324957355881422466_25039836_529778096/a.X.com/user/sguha/Rdist.tar.gz
> File=/data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh
>
> But having inspected the ls -lR of the working directory, I don't see
> this happening (only mscript.sh was symlinked; it was added via
> addCacheFile):
>
> ls -lR
> .:
> total 12
> lrwxrwxrwx 1 mapred mapred   90 Apr 28 22:11 job.jar -> /data01/hadoop/mapred/mapred/taskTracker/sguha/jobcache/job_201102231451_6814/jars/job.jar
> lrwxrwxrwx 1 mapred mapred  141 Apr 28 22:11 mscript.sh -> /data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh
> drwxr-xr-x 2 mapred mapred 4096 Apr 28 22:11 tmp
> ./tmp:
> total 0
>
> In summary:
>
> - I added mscript.sh via addCacheFile, and it was symlinked into the
> working directory. OK
> - I added a JAR file with some classes I needed using
> addArchiveToClassPath, and this worked too. OK
> - I added a tgz file using addCacheArchive, hoping it would be untarred,
> unzipped, and symlinked in the current folder. NOT OK
>
> Have I missed anything?
>
> Cheers
> Joy
>
>
>
> [1] http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/filecache/DistributedCache.html
> [2]     Path[] localArchives = DistributedCache.getLocalCacheArchives(context.getConfiguration());
>        Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
>        for(Path p : localArchives) System.out.println("Arch="+p);
>        for(Path p : localFiles) System.out.println("File="+p);
>