-
Re: Problem using distributed cache
Harsh J 2012-12-06, 17:02
What is your conf object there? Is it job.getConfiguration() or an independent instance?
On Thu, Dec 6, 2012 at 10:29 PM, Peter Cogan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I want to use the distributed cache to allow my mappers to access data. In
> main, I'm using the command
>
> DistributedCache.addCacheFile(new URI("/user/peter/cacheFile/testCache1"),
> conf);
>
> Where /user/peter/cacheFile/testCache1 is a file that exists in hdfs
>
> Then, my setup function looks like this:
>
> public void setup(Context context) throws IOException, InterruptedException{
>     Configuration conf = context.getConfiguration();
>     Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
>     //etc
> }
>
> However, this localFiles array is always null.
>
> I was initially running on a single-host cluster for testing, but I read
> that this will prevent the distributed cache from working. I tried with a
> pseudo-distributed, but that didn't work either
>
> I'm using hadoop 1.0.3
>
> thanks Peter
-- Harsh J
-
Re: Problem using distributed cache
surfer 2012-12-07, 14:49
Hello Peter,
In my humble experience, I never got Hadoop 1.0.3 to work with the distributed cache and the new API (mapreduce). With the old API it works.
giovanni
P.S. I already tried the approaches suggested by both Dhaval and Harsh J.
-
Re: Problem using distributed cache
Peter Cogan 2012-12-07, 14:06
Hi,
any thoughts on this would be much appreciated
thanks
Peter

On Thu, Dec 6, 2012 at 9:29 PM, Peter Cogan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> It's an instance created at the start of the program like this:
>
> public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     Job job = new Job(conf, "wordcount");
>     DistributedCache.addCacheFile(new URI("/user/peter/cacheFile/testCache1"),
>         conf);
>
> On Thu, Dec 6, 2012 at 5:02 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>> What is your conf object there? Is it job.getConfiguration() or an
>> independent instance?
-
Re: Problem using distributed cache
bejoy.hadoop@... 2012-12-07, 15:13
Hi Peter
Can you try the following in your code:
1. Have the driver class implement the Tool interface.
2. Do a getConfiguration() rather than creating a new conf instance.
The distributed cache (DC) should work with the above modifications to the code.
-
Re: Problem using distributed cache
Harsh J 2012-12-07, 14:25
Please try using job.getConfiguration() instead of the pre-job conf instance, because the Job constructor clones it.
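A minimal sketch of why the order matters, assuming (as stated above) that the Job constructor copies the configuration it is given. The Conf and Job classes below are hypothetical stand-ins written for this illustration, not the Hadoop classes, and the property key is invented; only the copy-on-construction behaviour is modelled.

```java
import java.util.HashMap;
import java.util.Map;

class CacheConfDemo {

    /** Hypothetical stand-in for a job configuration: a bag of string properties. */
    static final class Conf {
        private final Map<String, String> props = new HashMap<>();

        Conf() {}

        /** Copy constructor: snapshots the other configuration's properties. */
        Conf(Conf other) {
            props.putAll(other.props);
        }

        void set(String key, String value) { props.put(key, value); }

        String get(String key) { return props.get(key); }
    }

    /** Hypothetical stand-in for a job that clones the configuration it receives. */
    static final class Job {
        private final Conf conf;

        Job(Conf conf) { this.conf = new Conf(conf); }

        Conf getConfiguration() { return conf; }
    }

    static final String CACHE_KEY = "demo.cache.files"; // invented key, illustration only
    static final String CACHE_FILE = "/user/peter/cacheFile/testCache1";

    /** Order from the thread: register on the original conf after the job exists. */
    static String addAfterOnOriginal() {
        Conf conf = new Conf();
        Job job = new Job(conf);
        conf.set(CACHE_KEY, CACHE_FILE);          // the job's copy never sees this
        return job.getConfiguration().get(CACHE_KEY);
    }

    /** Dhaval's suggestion: register before creating the job. */
    static String addBeforeJob() {
        Conf conf = new Conf();
        conf.set(CACHE_KEY, CACHE_FILE);          // copied into the job on construction
        Job job = new Job(conf);
        return job.getConfiguration().get(CACHE_KEY);
    }

    /** Harsh's suggestion: register on the job's own configuration. */
    static String addAfterOnJobConf() {
        Conf conf = new Conf();
        Job job = new Job(conf);
        job.getConfiguration().set(CACHE_KEY, CACHE_FILE);
        return job.getConfiguration().get(CACHE_KEY);
    }

    public static void main(String[] args) {
        System.out.println("after Job, original conf: " + addAfterOnOriginal()); // null
        System.out.println("before Job:               " + addBeforeJob());
        System.out.println("after Job, job's conf:    " + addAfterOnJobConf());
    }
}
```

In the driver quoted in this thread, the corresponding change would be either to pass job.getConfiguration() instead of conf to DistributedCache.addCacheFile, or to move the addCacheFile call above new Job(conf, "wordcount").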
-- Harsh J
-
Re: Problem using distributed cache
Peter Cogan 2012-12-07, 15:22
Hi Dhaval & Harsh,
thanks for coming back to the thread - you're both right, I was doing things in the wrong order. I hadn't realised that the Job constructor clones the configuration - that's very interesting!
thanks again
Peter
-
Re: Problem using distributed cache
Dhaval Shah 2012-12-07, 14:23
You will need to add the cache file to the distributed cache before creating the Job object. Give that a spin and see if that works.

Regards,
Dhaval