Re: Execution directory for child process within mapper
I had a similar issue. When I needed the same file for every reduce (or map)
task, I simply added Java code to the setup method to write the file to ".".
When each map needed different files, I wrote the files just before calling
the executable. The trick also works when the code writes to a file rather
than stdout.
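
A minimal sketch of that idea, using only the Java standard library. The class
name, method names, and file name are hypothetical; in a real job the first
helper would be called from the mapper's or reducer's setup method with "." as
the directory, and the second just before the external program is needed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class WorkDirFiles {

    /** Writes content to a file named name inside dir and returns its path. */
    static Path writeSupportFile(Path dir, String name, String content) throws IOException {
        Path target = dir.resolve(name);
        // Creates the file, or overwrites it if it already exists.
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    /** Runs the command in the inherited working directory and returns its exit code. */
    static int runInCurrentDir(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.inheritIO(); // child shares this process's stdin, stdout and stderr
        return pb.start().waitFor();
    }
}
```

For example, `WorkDirFiles.writeSupportFile(Paths.get("."), "params.txt", "alpha=1\n")`
would place a (made-up) params.txt next to wherever the child process starts,
because the child inherits the task's working directory by default.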

On Mon, Sep 26, 2011 at 12:19 PM, Devaraj k <[EMAIL PROTECTED]> wrote:

> The localized distributed cache can also be helpful here, if you can make
> the necessary changes to your code. Its files are located in the local
> directory ${mapred.local.dir}/taskTracker/archive/.
>
> As per your explanation, I feel you can write the mapper in such a way that
> it copies the files from your customized location
> (/home/users/{user}/input/jobname) to the current working directory and then
> starts the executable.
>
> I hope this helps. :)
>
>
> Thanks
> Devaraj
> ________________________________________
> From: Joris Poort [[EMAIL PROTECTED]]
> Sent: Tuesday, September 27, 2011 12:25 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Execution directory for child process within mapper
>
> Hi Devaraj,
>
> Thanks for your help - that makes sense.  Is there any way to copy the
> local files needed for execution to the mapred.local.dir?
> Unfortunately I'm running local code that I cannot edit, and this
> code is the one that assumes these files are available in the same
> directory.
>
> Thanks!
>
> Joris
>
> On Mon, Sep 26, 2011 at 11:40 AM, Devaraj k <[EMAIL PROTECTED]> wrote:
> > Hi Joris,
> >
> > You cannot configure the work directory directly. You can configure the
> > local directory with the property 'mapred.local.dir', and it will then be
> > used to create the work directory, like
> > '${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work'. Based on
> > this, you can refer to your local command with a relative path.
> >
> > I hope this page will help you understand the directory structure clearly:
> > http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Directory+Structure
> >
> >
> > Thanks
> > Devaraj
> > ________________________________________
> > From: Joris Poort [[EMAIL PROTECTED]]
> > Sent: Monday, September 26, 2011 11:20 PM
> > To: mapreduce-user
> > Subject: Execution directory for child process within mapper
> >
> > As part of my Java mapper I have a command that executes some standalone
> > code on a local slave node. When I run the code it executes fine, unless
> > it is trying to access some local files, in which case I get an error
> > that it cannot locate those files.
> >
> > Digging a little deeper, it seems to be executing from the following
> > directory:
> >
> >    /data/hadoop/mapred/local/taskTracker/{user}/jobcache/job_201109261253_0023/attempt_201109261253_0023_m_000001_0/work
> >
> > But I am intending to execute from a local directory where the
> > relevant files are located:
> >
> >    /home/users/{user}/input/jobname
> >
> > Is there a way in java/hadoop to force the execution from the local
> > directory, instead of the jobcache directory automatically created in
> > hadoop?
> >
> > Is there perhaps a better way to go about this?
> >
> > Any help on this would be greatly appreciated!
> >
> > Cheers,
> >
> > Joris
> >
>
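
The copy-then-execute suggestion quoted above could be sketched roughly as
follows, assuming the input files really do exist on the local disk of every
slave node. The class and method names are hypothetical. The second helper uses
the standard ProcessBuilder.directory call, which sets the child's working
directory explicitly; that is an alternative to copying and was not proposed in
the thread.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class StageAndRun {

    /** Copies every regular file in sourceDir into workDir; returns the copied paths. */
    static List<Path> stageFiles(Path sourceDir, Path workDir) throws IOException {
        List<Path> copied = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(sourceDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) { // subdirectories are skipped
                    Path target = workDir.resolve(entry.getFileName());
                    Files.copy(entry, target, StandardCopyOption.REPLACE_EXISTING);
                    copied.add(target);
                }
            }
        }
        return copied;
    }

    /** Runs the command with workDir as its working directory; returns the exit code. */
    static int runIn(Path workDir, List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.directory(workDir.toFile()); // child starts in workDir instead of inheriting "."
        pb.inheritIO();
        return pb.start().waitFor();
    }
}
```

With the paths from the original question, stageFiles would be called with
/home/users/{user}/input/jobname as the source and "." as the work directory
before the executable is started.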

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com