Hadoop >> mail # user >> Child JVM, Distributed Cache and Language Embedding


Saptarshi Guha 2013-02-13, 05:28
Re: Child JVM, Distributed Cache and Language Embedding
Hmm,
DistributedCache.getLocalCacheArchives
On Tue, Feb 12, 2013 at 9:28 PM, Saptarshi Guha <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I'm a bit fuzzy on the details here, so I'd appreciate your help.
>
> I am embedding a language into the JVM. My Hadoop job will instantiate the
> child JVM once for all tasks assigned (mapred.job.reuse.jvm.num.tasks = -1).
>
> So if a node can run 6 parallel JVMs, it will, and these 6 will churn
> through all the tasks assigned to them.
>
> Now, per JVM, the language engine will be instantiated. For this to work,
> I will ship the language distribution to the nodes (the nodes are really
> bare and installing the language on the node is not an option) using the
> distributed cache (as a tar.gz file).
>
> My understanding is that Hadoop MapReduce will unarchive this tgz file and
> then for every task attempt symlink it into the task attempt's working
> folder.
>
> However, for the language engine to be successfully initialized I need to
> know the location of the unarchived file, a location that will stay
> constant across all task attempts for that child JVM.
>
> Q: How can I infer this location?
>
> Cheers
> Saptarshi
>
>
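
One way to sketch the idea in the question above, assuming the unpacked archive really is symlinked into each task attempt's working folder: resolve that symlink to its real path once per JVM and hand the result to the language engine. The class name, method name, and the link name "lang" are hypothetical and used only for illustration; the sketch uses plain java.nio and does not call any Hadoop API, so it is not a substitute for checking what the job actually sees on a node.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CacheLocation {
    /**
     * Resolves a link inside a task working directory to the real directory
     * it points at. linkName is a hypothetical placeholder for whatever
     * symlink name the job was configured with.
     */
    public static Path resolveArchiveRoot(Path workDir, String linkName) throws IOException {
        Path link = workDir.resolve(linkName);
        if (!Files.exists(link)) {
            throw new IOException("No cache link at " + link);
        }
        // toRealPath() follows symlinks, so this yields the target directory
        // rather than the per-attempt link.
        return link.toRealPath();
    }

    public static void main(String[] args) throws IOException {
        // Temp directories stand in for the unpacked archive and the
        // task attempt's working folder.
        Path unpacked = Files.createTempDirectory("unpacked-archive");
        Path workDir = Files.createTempDirectory("task-attempt");
        Files.createSymbolicLink(workDir.resolve("lang"), unpacked);

        Path root = resolveArchiveRoot(workDir, "lang");
        System.out.println("Engine home: " + root);
    }
}
```

Resolving the link rather than using it directly matters only if the per-attempt working folder changes while the JVM is reused; whether the resolved target stays put for the life of the job is an assumption worth verifying on a real cluster.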
David Boyd 2013-02-13, 13:43