Hadoop >> mail # user >> Symlinks for cacheArchives


Symlinks for cacheArchives
Hello,

From the docs (for 0.20) for DistributedCache [1], I'm under the
impression that .tgz files will be unzipped, untarred, and symlinked
into the job's current working directory.

However, when I run the job, this little fragment [2] prints the
following (I called DistributedCache.createSymlink(config_) just after
adding the cache components):

Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5775566659502863353_-129792898_530471609/a.X.com/user/sguha/tmp/rhipe-hbase.jar
Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5324957355881422466_25039836_529778096/a.X.com/user/sguha/Rdist.tar.gz
File=/data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh

But having inspected the ls -lR output of the working directory, I don't
see this happening (only mscript.sh, which was added via addCacheFile,
was symlinked):

ls -lR
.:
total 12
lrwxrwxrwx 1 mapred mapred   90 Apr 28 22:11 job.jar -> /data01/hadoop/mapred/mapred/taskTracker/sguha/jobcache/job_201102231451_6814/jars/job.jar
lrwxrwxrwx 1 mapred mapred  141 Apr 28 22:11 mscript.sh -> /data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh
drwxr-xr-x 2 mapred mapred 4096 Apr 28 22:11 tmp
./tmp:
total 0
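
The same check can be made from inside the task rather than with ls. The following is a minimal sketch using only the standard library (java.nio.file, so it assumes a Java 8 or later runtime, which is newer than what Hadoop 0.20 was typically deployed on). The class name WorkDirLinks and the "Cwd=" prefix are made up for illustration; nothing here is part of Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists the entries of a directory and shows where symlinks point.
public class WorkDirLinks {

    // Returns one line per direct child of dir, sorted by name:
    // "name -> target" for symbolic links, plain "name" for everything else.
    static List<String> describe(Path dir) {
        List<String> lines = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (Files.isSymbolicLink(entry)) {
                    lines.add(name + " -> " + Files.readSymbolicLink(entry));
                } else {
                    lines.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(lines);
        return lines;
    }

    public static void main(String[] args) {
        // "." is the task's working directory when this runs inside a task.
        for (String line : describe(Paths.get("."))) {
            System.out.println("Cwd=" + line);
        }
    }
}
```

Calling describe(Paths.get(".")) from a mapper's setup would put the listing in the task logs next to the Arch=/File= lines, so the localized paths and the links can be compared in one place.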

In summary:

- I added a file (mscript.sh) via addCacheFile: it was symlinked into the working directory. OK
- I added a JAR file with some classes I needed via addArchiveToClassPath: this worked too. OK
- I added a tgz file via addCacheArchive, hoping it would be unzipped, untarred, and
symlinked into the current directory. NOT OK
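
One possibly relevant detail: the Hadoop MapReduce tutorial describes the fragment of a cache URI (the part after '#') as the name used for the symlink. The sketch below only illustrates that naming convention with java.net.URI; it is not Hadoop's implementation, and the post does not confirm that a missing fragment is the cause here. The class CacheLinkNames, the HDFS paths, and the link name "Rdist" are assumptions made up for the example.

```java
import java.net.URI;

// Hypothetical helper illustrating the "fragment is the link name" convention.
public class CacheLinkNames {

    // Returns the link name requested by the URI fragment,
    // or null when the URI carries no (or an empty) fragment.
    static String linkName(URI cacheUri) {
        String fragment = cacheUri.getFragment();
        return (fragment == null || fragment.isEmpty()) ? null : fragment;
    }

    public static void main(String[] args) {
        String[] uris = {
            "/user/sguha/Rdist.tar.gz",        // assumed path, no fragment
            "/user/sguha/Rdist.tar.gz#Rdist",  // assumed path, asks for a link named "Rdist"
        };
        for (String s : uris) {
            System.out.println(s + " => " + linkName(URI.create(s)));
        }
    }
}
```

In a job, the equivalent experiment would be to pass a URI with a fragment to DistributedCache.addCacheArchive(URI, Configuration) and then look for an entry with that name in the task's working directory.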

Have I missed anything?

Cheers
Joy

[1] http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/filecache/DistributedCache.html
[2] Path[] localArchives = DistributedCache.getLocalCacheArchives(context.getConfiguration());
Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
for (Path p : localArchives) System.out.println("Arch=" + p);
for (Path p : localFiles) System.out.println("File=" + p);