Saptarshi Guha 2013-02-13, 05:28
Re: Child JVM, Distributed Cache and Language Embedding
Hmm,
DistributedCache.getLocalCacheArchives
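
In the old org.apache.hadoop.filecache API, DistributedCache.getLocalCacheArchives(conf) returns the localized archive paths as a Path[]. Where only the symlink in the task working directory is at hand, a JDK-only alternative is to resolve that link to its target. The following is a minimal sketch under assumptions: the link name "lang-dist" is hypothetical, and the class and method names are illustrative and not part of Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CacheLocation {
    /**
     * Resolves an entry in the task working directory to the real
     * directory it points at, following symbolic links.
     */
    static Path resolveUnpackedArchive(Path workDir, String linkName) throws IOException {
        Path link = workDir.resolve(linkName);
        // toRealPath() follows symlinks and returns an absolute path;
        // it throws IOException if the entry does not exist.
        return link.toRealPath();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the current directory is the task attempt's working
        // directory and "lang-dist" is the (hypothetical) symlink name.
        Path workDir = Paths.get("").toAbsolutePath();
        String linkName = args.length > 0 ? args[0] : "lang-dist";
        if (Files.exists(workDir.resolve(linkName))) {
            System.out.println(resolveUnpackedArchive(workDir, linkName));
        } else {
            System.out.println("no such entry: " + workDir.resolve(linkName));
        }
    }
}
```

Whether the resolved target really stays the same for the lifetime of a reused child JVM should be checked on the cluster in question.
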
On Tue, Feb 12, 2013 at 9:28 PM, Saptarshi Guha <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I'm a bit fuzzy on the details here, so I'd appreciate your help.
>
> I am embedding a language into the JVM. My Hadoop job will instantiate the
> child JVM once for all tasks assigned (mapred.job.reuse.jvm.num.tasks = -1).
>
> So if a node can run 6 parallel JVMs, it will, and these 6 will churn
> through all the tasks assigned to them.
>
> Now, per JVM, the language engine will be instantiated. For this to work,
> I will ship the language distribution to the nodes (the nodes are really
> bare and installing the language on the node is not an option) using the
> distributed cache (as a tar.gz file).
>
> My understanding is that Hadoop MapReduce will unarchive this tgz file and
> then, for every task attempt, symlink it into the task attempt's working
> folder.
>
> However, for the language engine to be successfully initialized, I need to
> know the location of the unarchived file, a location that will stay
> constant across all task attempts for that child JVM.
>
> Q: How can I infer this location?
>
> Cheers
> Saptarshi
>
>
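
The once-per-JVM instantiation described in the question can be sketched with a static holder, since static state survives across tasks when the child JVM is reused. This is only an illustration under assumptions: EngineHolder, getOrInit, and the counter standing in for the real engine startup are hypothetical names, and nothing here calls Hadoop.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicInteger;

final class EngineHolder {
    // Counts engine startups in this JVM; a stand-in for real initialization.
    static final AtomicInteger initCount = new AtomicInteger();

    // Home directory recorded on first use; guarded by the class lock.
    private static Path engineHome;

    /** Starts the (hypothetical) engine on the first call only. */
    static synchronized Path getOrInit(Path candidateHome) {
        if (engineHome == null) {
            engineHome = candidateHome.toAbsolutePath().normalize();
            initCount.incrementAndGet();
        }
        return engineHome;
    }

    public static void main(String[] args) {
        // Two task attempts in the same JVM, each offering its own path.
        Path first = getOrInit(Paths.get("attempt_1", "lang-dist"));
        Path second = getOrInit(Paths.get("attempt_2", "lang-dist"));
        System.out.println(first.equals(second));
        System.out.println(initCount.get());
    }
}
```

Because later attempts reuse whatever path the first attempt supplied, that first path needs to be one that remains valid after the first attempt's working folder is gone, which is why the question asks for a constant location.
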
David Boyd 2013-02-13, 13:43