Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
MapReduce >> mail # user >> task jvm bootstrapping via distributed cache


Copy link to this message
-
Re: task jvm bootstrapping via distributed cache
I am guessing this is either a well-known problem or an edge case.  In
any case, would it be a bad idea to designate predetermined output
paths?
E.g., DistributedCache.addCacheFileInto(uri, conf, outputPath) would
attempt to copy the cached file into the specified path resolving to a
task's local filesystem.

Thanks,

stan

On Mon, Jul 30, 2012 at 6:23 PM, Stan Rosenberg
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am seeking a way to leverage hadoop's distributed cache in order to
> ship jars that are required to bootstrap a task's jvm, i.e., before a
> map/reduce task is launched.
> As a concrete example, let's say that I need to launch with
> '-javaagent:/path/profiler.jar'.  In theory, the task tracker is
> responsible for downloading cached files onto its local filesystem.
> However, the absolute path to a given cached file is not known a
> priori; however, we need the path in order to configure '-javaagent'.
>
> Is this currently possible with the distributed cache? If not, is the
> use case appealing enough to open a jira ticket?
>
> Thanks,
>
> stan
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB