MapReduce, mail # user - Execution directory for child process within mapper


RE: Execution directory for child process within mapper
Devaraj k 2011-09-26, 19:19
The localized distributed cache can also help here, if you can make the necessary changes to your code. Its files are placed in the local directory ${mapred.local.dir}/taskTracker/archive/.

From your explanation, I think you can write the mapper so that it copies the files from your custom location (/home/users/{user}/input/jobname) to the current working directory and then starts the executable.
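
A minimal sketch of that copy-then-run approach, assuming Java 7 or later (for java.nio.file and ProcessBuilder.inheritIO). The class and method names (LocalInputStager, copyFiles, runIn) are hypothetical helpers for illustration, not part of Hadoop, and only top-level regular files are copied:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper, not a Hadoop class: stages input files next to the
// child process before it is launched.
final class LocalInputStager {

    private LocalInputStager() {
    }

    // Copies every regular file directly inside sourceDir into targetDir,
    // overwriting files of the same name. Subdirectories are skipped.
    // Returns the number of files copied.
    static int copyFiles(Path sourceDir, Path targetDir) throws IOException {
        int copied = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(sourceDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    Files.copy(entry, targetDir.resolve(entry.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }
        return copied;
    }

    // Starts the command with workDir as the child's working directory,
    // waits for it to finish, and returns its exit code.
    static int runIn(Path workDir, List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workDir.toFile());
        builder.inheritIO(); // child output goes to the task's stdout/stderr
        return builder.start().waitFor();
    }
}
```

From the mapper, one might call copyFiles(Paths.get("/home/users/{user}/input/jobname"), Paths.get("").toAbsolutePath()) with {user} filled in, and then runIn(...) with the executable's command line. Since ProcessBuilder.directory() sets the child's working directory, passing the input directory itself to runIn may avoid the copy altogether, provided that directory exists on every node that runs the task.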

I hope this helps. :)
Thanks
Devaraj
________________________________________
From: Joris Poort [[EMAIL PROTECTED]]
Sent: Tuesday, September 27, 2011 12:25 AM
To: [EMAIL PROTECTED]
Subject: Re: Execution directory for child process within mapper

Hi Devaraj,

Thanks for your help - that makes sense.  Is there any way to copy the
local files needed for execution to the mapred.local.dir?
Unfortunately I'm running local code which I cannot edit, and this
code is the one which assumes these files are available in the same
directory.

Thanks!

Joris

On Mon, Sep 26, 2011 at 11:40 AM, Devaraj k <[EMAIL PROTECTED]> wrote:
> Hi Joris,
>
> You cannot configure the work directory directly. You can configure the local directory with the property 'mapred.local.dir', and the work directory is then created beneath it, like '${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work'. Based on this, you can refer to your local command using a path relative to the work directory.
>
> I hope this page will help you to understand the directory structure clearly. http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Directory+Structure
>
>
> Thanks
> Devaraj
> ________________________________________
> From: Joris Poort [[EMAIL PROTECTED]]
> Sent: Monday, September 26, 2011 11:20 PM
> To: mapreduce-user
> Subject: Execution directory for child process within mapper
>
> As part of my Java mapper I have a command that executes some standalone
> code on a local slave node. When I run the code it executes fine, unless
> it is trying to access some local files, in which case I get an error
> that it cannot locate those files.
>
> Digging a little deeper it seems to be executing from the following directory:
>
>    /data/hadoop/mapred/local/taskTracker/{user}/jobcache/job_201109261253_0023/attempt_201109261253_0023_m_000001_0/work
>
> But I am intending to execute from a local directory where the
> relevant files are located:
>
>    /home/users/{user}/input/jobname
>
> Is there a way in java/hadoop to force the execution from the local
> directory, instead of the jobcache directory automatically created in
> hadoop?
>
> Is there perhaps a better way to go about this?
>
> Any help on this would be greatly appreciated!
>
> Cheers,
>
> Joris
>