MapReduce >> mail # user >> Re: Problem using distributed cache


Re: Problem using distributed cache
You will need to add the cache file to the distributed cache before creating the Job object. Give that a spin and see if it works.
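A minimal sketch of that ordering (Hadoop 1.x API, as used in this thread; the class name and the elided job setup are placeholders): `new Job(conf, ...)` takes its own copy of `conf`, so the cache file has to be registered before the Job is constructed for the setting to reach the job.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapreduce.Job;

public class WordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Register the cache file BEFORE constructing the Job:
        // new Job(conf, ...) copies conf, so changes made to conf
        // afterwards never reach the job's own configuration.
        DistributedCache.addCacheFile(
                new URI("/user/peter/cacheFile/testCache1"), conf);

        Job job = new Job(conf, "wordcount");
        // ... set mapper, reducer, input/output paths, then
        // job.waitForCompletion(true) as usual ...
    }
}
```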
 
Regards,
Dhaval
________________________________
 From: Peter Cogan <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Friday, 7 December 2012 9:06 AM
Subject: Re: Problem using distributed cache
 

Hi,

any thoughts on this would be much appreciated

thanks
Peter

On Thu, Dec 6, 2012 at 9:29 PM, Peter Cogan <[EMAIL PROTECTED]> wrote:

>Hi,
>
>It's an instance created at the start of the program, like this:
>
>    public static void main(String[] args) throws Exception {
>        Configuration conf = new Configuration();
>
>        Job job = new Job(conf, "wordcount");
>
>        DistributedCache.addCacheFile(new URI("/user/peter/cacheFile/testCache1"), conf);
>
>On Thu, Dec 6, 2012 at 5:02 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>
>>What is your conf object there? Is it job.getConfiguration() or an
>>independent instance?
>>
>>
>>On Thu, Dec 6, 2012 at 10:29 PM, Peter Cogan <[EMAIL PROTECTED]> wrote:
>>> Hi ,
>>>
>>> I want to use the distributed cache to allow my mappers to access data. In
>>> main, I'm using the command
>>>
>>> DistributedCache.addCacheFile(new URI("/user/peter/cacheFile/testCache1"),
>>> conf);
>>>
>>> Where /user/peter/cacheFile/testCache1 is a file that exists in hdfs
>>>
>>> Then, my setup function looks like this:
>>>
>>> public void setup(Context context) throws IOException, InterruptedException{
>>>     Configuration conf = context.getConfiguration();
>>>     Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
>>>     //etc
>>> }
>>>
>>> However, this localFiles array is always null.
>>>
>>> I was initially running on a single-host cluster for testing, but I read
>>> that this will prevent the distributed cache from working. I tried with a
>>> pseudo-distributed cluster, but that didn't work either.
>>>
>>> I'm using Hadoop 1.0.3.
>>>
>>> thanks Peter
>>
>>--
>>Harsh J
>>
>
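Harsh's question points at the other common fix: if the Job object already exists, add the cache file to the job's own configuration via job.getConfiguration() rather than to a detached conf instance. A sketch under that assumption (Hadoop 1.x API; the mapper class and the elided job setup are illustrative, not from the thread):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheExample {

    public static class CacheMapper
            extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void setup(Context context)
                throws IOException, InterruptedException {
            Configuration conf = context.getConfiguration();
            Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
            // localFiles is non-null once the file was registered in the
            // configuration the job actually submitted.
            if (localFiles != null && localFiles.length > 0) {
                BufferedReader reader = new BufferedReader(
                        new FileReader(localFiles[0].toString()));
                // ... read the side data here ...
                reader.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "wordcount");
        // Add to the job's OWN configuration, so the entry survives
        // into the submitted job:
        DistributedCache.addCacheFile(
                new URI("/user/peter/cacheFile/testCache1"),
                job.getConfiguration());
        job.setMapperClass(CacheMapper.class);
        // ... remaining job setup and job.waitForCompletion(true) ...
    }
}
```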