DistributedCache: getLocalCacheFiles() always null
Hi all,

I am trying to use the DistributedCache with the new Hadoop API.
According to the documentation it seems that nothing has changed, and
usage is the same as with the old API.
However, I am facing some problems. This is the snippet in which I use it:
// setting input/output format classes
....

// DISTRIBUTED CACHE
DistributedCache.addCacheFile(
        new Path("/cdr/input/cgi.csv#cgi.csv").toUri(), getConf());
job.waitForCompletion(true);
and in my reducer:

@Override
protected void setup(Context context) throws IOException {
      Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
      ....
}

localFiles is always null. I read that getLocalCacheFiles() should be
called in the configure() method, but the mapper/reducer of the new API
do not have that method.
What's wrong?
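
In case it helps, here is a stripped-down but self-contained version of
what I am running (the class name, the job wiring, and the null check in
setup() are placeholders I added just for this message; the
DistributedCache calls are exactly the ones above):

import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class CacheTest extends Configured implements Tool {

    public static class CacheReducer
            extends Reducer<LongWritable, Text, LongWritable, Text> {

        @Override
        protected void setup(Context context)
                throws IOException, InterruptedException {
            // this is the call that always returns null for me
            Path[] localFiles =
                    DistributedCache.getLocalCacheFiles(context.getConfiguration());
            if (localFiles == null) {
                throw new IOException("getLocalCacheFiles() returned null");
            }
        }
        // reduce() omitted; the default pass-through is enough here
    }

    @Override
    public int run(String[] args) throws Exception {
        Job job = new Job(getConf(), "distributed-cache-test");
        job.setJarByClass(CacheTest.class);
        job.setReducerClass(CacheReducer.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // registered exactly as in the snippet above: after the Job is
        // created, on the Configuration returned by getConf()
        DistributedCache.addCacheFile(
                new Path("/cdr/input/cgi.csv#cgi.csv").toUri(), getConf());

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new CacheTest(), args));
    }
}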
I read that the DistributedCache has some issues if you try to run
your program from a client (e.g., from inside an IDE), but I also
tried running it directly on the cluster.
Thanks.

--
Alberto Cordioli
Alberto Cordioli 2012-10-19, 12:49