MapReduce >> mail # user >> Re: Distributed Cache


RE: Distributed Cache
OK, using job.addCacheFile() seems to compile correctly.
However, how do I then access the cached file in my Mapper code? Is there a method that returns the files in the cache?

Thanks,

Andrew
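
A minimal sketch for the Mapper-side question above, not a definitive answer. It assumes the Hadoop 2 mapreduce API, where `context.getCacheFiles()` should return the URIs registered with `job.addCacheFile()`, and that on YARN each cached file is linked into the task's working directory under its file name (or under the URI fragment, if one was given). The `CacheLookup` class, its method names, and the tab-separated file format are hypothetical and only illustrate reading such a file with plain Java I/O.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Hadoop. */
final class CacheLookup {

    private CacheLookup() {}

    /**
     * Name under which a cached file is assumed to appear in the task's
     * working directory: the URI fragment if present, otherwise the last
     * segment of the path.
     */
    static String localName(URI cacheUri) {
        String fragment = cacheUri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = cacheUri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * Reads tab-separated key/value lines into a map. Lines without a tab,
     * or with an empty key, are skipped.
     */
    static Map<String, String> load(Path localFile) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab > 0) {
                    lookup.put(line.substring(0, tab), line.substring(tab + 1));
                }
            }
        }
        return lookup;
    }
}

// Assumed wiring inside a Mapper (Hadoop 2 mapreduce API), shown as comments
// because it needs the Hadoop jars on the classpath:
//
//   @Override
//   protected void setup(Context context) throws IOException {
//       URI[] cached = context.getCacheFiles();  // may be null if nothing was cached
//       if (cached != null && cached.length > 0) {
//           lookup = CacheLookup.load(
//               java.nio.file.Paths.get(CacheLookup.localName(cached[0])));
//       }
//   }
```

Loading the file once in `setup()` rather than in `map()` keeps the per-record cost down, since `setup()` runs once per task.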

From: Ted Yu [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, July 09, 2013 6:08 PM
To: [EMAIL PROTECTED]
Subject: Re: Distributed Cache

You should use Job#addCacheFile()

Cheers
On Tue, Jul 9, 2013 at 3:02 PM, Botelho, Andrew <[EMAIL PROTECTED]> wrote:
Hi,

I was wondering if I can still use the DistributedCache class in the latest release of Hadoop (version 2.0.5).
In my driver class, I use this code to try to add a file to the distributed cache:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

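Configuration conf = new Configuration();
// Deprecated call: registers the cache file on the Configuration.
DistributedCache.addCacheFile(new URI("file path in HDFS"), conf);
// Pass conf so the job is built from the Configuration that holds the cache entry.
Job job = Job.getInstance(conf);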
...

However, I keep getting warnings that the method addCacheFile() is deprecated.
Is there a more current way to add files to the distributed cache?

Thanks in advance,

Andrew
