Hadoop >> mail # user >> Symlinks for cacheArchives


Re: Symlinks for cacheArchives
1. Hmm, .tgz files are unzipped, but the name doesn't change.
2. Append "#name" to the archive URI to have it symlinked there under that name.
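
To illustrate point 2, here is a minimal sketch of building an archive URI with a "#name" fragment. The path /user/sguha/Rdist.tar.gz and the link name "Rdist" are assumptions made up for the example, as is the helper withSymlink. The two DistributedCache calls appear only as comments because they need Hadoop on the classpath; the runnable part just shows what the URI looks like.

```java
import java.net.URI;

public class CacheArchiveLink {
    // Hypothetical helper: appends "#linkName" to the archive path.
    // DistributedCache reads the URI fragment as the symlink name.
    static URI withSymlink(String archivePath, String linkName) {
        return URI.create(archivePath + "#" + linkName);
    }

    public static void main(String[] args) {
        // Path and link name are assumptions for illustration only.
        URI archive = withSymlink("/user/sguha/Rdist.tar.gz", "Rdist");
        System.out.println(archive);

        // In the job driver (requires Hadoop, so left as comments):
        //   DistributedCache.addCacheArchive(archive, conf);
        //   DistributedCache.createSymlink(conf);
    }
}
```

With a fragment like this, the unpacked archive should show up in the task's working directory as a symlink named after the fragment, rather than only under the distcache path.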
On Thu, Apr 28, 2011 at 10:23 PM, Saptarshi Guha
<[EMAIL PROTECTED]> wrote:
> Hello,
>
> From the docs (for 0.20) for DistributedCache [1] I'm under the
> impression that .tgz files will be unzipped, untarred, and symlinked
> into the job's current dir.
>
> However, when running the job, this little fragment [2] reveals (I
> have called DistributedCache.createSymlink(config_); just after
> adding the cache components):
>
> Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5775566659502863353_-129792898_530471609/a.X.com/user/sguha/tmp/rhipe-hbase.jar
> Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5324957355881422466_25039836_529778096/a.X.com/user/sguha/Rdist.tar.gz
> File=/data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh
>
> But having inspected the ls -lR of the working directory, I don't see
> this happening (only mscript.sh was symlinked; it was added via
> addCacheFile)
>
> ls -lR
> .:
> total 12
> lrwxrwxrwx 1 mapred mapred   90 Apr 28 22:11 job.jar ->
> /data01/hadoop/mapred/mapred/taskTracker/sguha/jobcache/job_201102231451_6814/jars/job.jar
> lrwxrwxrwx 1 mapred mapred  141 Apr 28 22:11 mscript.sh ->
> /data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh
> drwxr-xr-x 2 mapred mapred 4096 Apr 28 22:11 tmp
> ./tmp:
> total 0
>
> In summary:
>
> - I added mscript.sh via addCacheFile - symlinked into working directory. OK
> - I added a JAR file with some classes I needed - added using
> addArchiveToClassPath and this worked too - OK
> - I added a tgz file hoping it would be untarred, unzipped and
> symlinked in current folder (using addCacheArchive) - NOT-OK
>
> Have I missed anything?
>
> Cheers
> Joy
>
>
>
> [1] http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/filecache/DistributedCache.html
> [2]     Path[] localArchives = DistributedCache.getLocalCacheArchives(context.getConfiguration());
>        Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
>        for(Path p : localArchives) System.out.println("Arch="+p);
>        for(Path p : localFiles) System.out.println("File="+p);
>