MapReduce, mail # user - task jvm bootstrapping via distributed cache


Re: task jvm bootstrapping via distributed cache
Stan Rosenberg 2012-08-03, 16:32
Arun,

I don't believe the symlink helps.  It is created in the task's
current working directory (cwd), but I don't know what that cwd will
be at the time I launch the job with 'hadoop jar ...'.

Thanks,

stan
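
Editorial sketch of the approach under discussion. It assumes, and this
should be verified on the cluster in question, that Hadoop 1.x starts the
child JVM with the task's working directory as its cwd, so that a path
relative to that directory resolves to the symlink and the absolute path
need not be known at submit time. The HDFS path /libs/profiler.jar and the
class and method names are hypothetical. The code uses only the Java
standard library and merely builds the property values; the comments name
the Hadoop 1.x calls they correspond to.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the settings a job might carry so that a
// cached jar is symlinked into the task's cwd and used as a Java agent.
class CacheSymlinkSketch {

    // The URI fragment (after '#') is the symlink name requested for the
    // cached file. Corresponds to DistributedCache.addCacheFile(uri, conf).
    static URI cacheUri(String hdfsPath, String linkName) {
        return URI.create(hdfsPath + "#" + linkName);
    }

    // Relative agent path. ASSUMPTION: the child JVM's cwd is the task
    // working directory that holds the symlink.
    static String agentOpts(String linkName) {
        return "-javaagent:./" + linkName;
    }

    // Hadoop 1.x property names and the values this sketch would set.
    static Map<String, String> jobSettings(String hdfsPath, String linkName) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mapred.cache.files", cacheUri(hdfsPath, linkName).toString());
        // Corresponds to DistributedCache.createSymlink(conf).
        settings.put("mapred.create.symlink", "yes");
        settings.put("mapred.child.java.opts", agentOpts(linkName));
        return settings;
    }

    public static void main(String[] args) {
        jobSettings("/libs/profiler.jar", "profiler.jar")
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Note that the cwd of the 'hadoop jar ...' client process does not enter
into this: the relative path is only interpreted where the task JVM is
started, provided the assumption above holds.
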

On Fri, Aug 3, 2012 at 2:39 AM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
> Stan,
>
>  You can ask the TaskTracker (TT) to create a symlink to your jar shipped via DistCache:
>
> http://hadoop.apache.org/common/docs/r1.0.3/mapred_tutorial.html#DistributedCache
>
>  That should give you what you want.
>
> hth,
> Arun
>
> On Jul 30, 2012, at 3:23 PM, Stan Rosenberg wrote:
>
> Hi,
>
> I am seeking a way to leverage hadoop's distributed cache in order to
> ship jars that are required to bootstrap a task's jvm, i.e., before a
> map/reduce task is launched.
> As a concrete example, let's say that I need to launch with
> '-javaagent:/path/profiler.jar'.  In theory, the task tracker is
> responsible for downloading cached files onto its local filesystem.
> However, the absolute path to a given cached file is not known a
> priori, and we need that path in order to configure '-javaagent'.
>
> Is this currently possible with the distributed cache? If not, is the
> use case appealing enough to open a jira ticket?
>
> Thanks,
>
> stan
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>