Re: Execution directory for child process within mapper
Steve Lewis 2011-09-26, 19:27
I had a similar issue - when I needed the same file for each reduce (or map
task) I simply added Java code to the setup method to write the file to ".".
When every map needed different files, I wrote the files before calling the
executable. The trick also works when the code writes to a file rather than
stdout.

On Mon, Sep 26, 2011 at 12:19 PM, Devaraj k <[EMAIL PROTECTED]> wrote:

> The localized distributed cache can also be helpful here, if you can make
> the necessary changes to your code. It is located in the local directory
> ${mapred.local.dir}/taskTracker/archive/.
>
> As per your explanation, I feel you can write the mapper in such a way that
> it copies the files from your customized location
> (/home/users/{user}/input/jobname) to the current working directory and
> then starts the executable.
>
> I hope this helps. :)
>
> Thanks
> Devaraj
> ________________________________________
> From: Joris Poort [[EMAIL PROTECTED]]
> Sent: Tuesday, September 27, 2011 12:25 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Execution directory for child process within mapper
>
> Hi Devaraj,
>
> Thanks for your help - that makes sense. Is there any way to copy the
> local files needed for execution to the mapred.local.dir?
> Unfortunately I'm running a local code which I cannot edit - and this
> code is the one which assumes these files are available in the same
> directory.
>
> Thanks!
>
> Joris
>
> On Mon, Sep 26, 2011 at 11:40 AM, Devaraj k <[EMAIL PROTECTED]> wrote:
> > Hi Joris,
> >
> > You cannot configure the work directory directly. You can configure the
> > local directory with the property 'mapred.local.dir', and it will be used
> > to create the work directory, like
> > '${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work'. Based on
> > this, you can refer to your local command by a relative path to execute it.
> >
> > I hope this page will help you to understand the directory structure
> > clearly.
> > http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Directory+Structure
> >
> > Thanks
> > Devaraj
> > ________________________________________
> > From: Joris Poort [[EMAIL PROTECTED]]
> > Sent: Monday, September 26, 2011 11:20 PM
> > To: mapreduce-user
> > Subject: Execution directory for child process within mapper
> >
> > As part of my Java mapper I have a command that executes some standalone
> > code on a local slave node. When I run the code it executes fine, unless
> > it is trying to access some local files, in which case I get the error
> > that it cannot locate those files.
> >
> > Digging a little deeper, it seems to be executing from the following
> > directory:
> >
> > /data/hadoop/mapred/local/taskTracker/{user}/jobcache/job_201109261253_0023/attempt_201109261253_0023_m_000001_0/work
> >
> > But I am intending to execute from a local directory where the
> > relevant files are located:
> >
> > /home/users/{user}/input/jobname
> >
> > Is there a way in java/hadoop to force the execution from the local
> > directory, instead of the jobcache directory automatically created in
> > hadoop?
> >
> > Is there perhaps a better way to go about this?
> >
> > Any help on this would be greatly appreciated!
> >
> > Cheers,
> >
> > Joris

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
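
The copy-then-execute approach discussed in this thread (put the needed files into the task's current working directory, then launch the executable from there) might look roughly like the sketch below. It uses only the standard JDK and no Hadoop API; the class name WorkDirStager and the methods stageFiles and runInDirectory are hypothetical names chosen for illustration, not anything from the thread or from Hadoop. It assumes the input directory is readable on the node where the task runs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Hadoop): stages input files into a
// working directory and runs an external command from that directory.
final class WorkDirStager {

    private WorkDirStager() {
    }

    // Copies every regular file found directly in inputDir into workDir,
    // overwriting files of the same name. Subdirectories are not copied.
    public static void stageFiles(Path inputDir, Path workDir) throws IOException {
        Files.createDirectories(workDir);
        List<Path> sources;
        try (Stream<Path> entries = Files.list(inputDir)) {
            sources = entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path source : sources) {
            Files.copy(source, workDir.resolve(source.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Starts the command with workDir as the child's working directory,
    // waits for it to finish, and returns its exit code.
    public static int runInDirectory(Path workDir, List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workDir.toFile()); // child process starts in workDir
        builder.inheritIO();                 // child output goes to the task's stdout/stderr
        return builder.start().waitFor();
    }
}
```

Called from a mapper's setup method, this would be something like stageFiles(Paths.get("/home/users/{user}/input/jobname"), Paths.get(".")) followed by runInDirectory(Paths.get("."), command) in the map call, where command is whatever list of arguments launches the executable. Setting the directory on ProcessBuilder is redundant when the target is already "."; it is included only to make the child's working directory explicit.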