MapReduce >> mail # user >> efficiency of LocalResources and archives


efficiency of LocalResources and archives
Suppose I have a large archive in HDFS, say 500 files totaling 4 GB, and I want to make it available as a YARN LocalResource. The archive rarely changes (maybe once per month). Will YARN optimize for this? Does the expanded per-node cache persist across application runs, using something like the modification time to decide whether re-expansion is needed?
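
To make the question concrete, here is a minimal sketch of the reuse check being asked about. It is a hypothetical model, not YARN's implementation: the class `ArchiveCacheSketch`, its methods, and the example path are all invented for illustration. The only assumption taken from YARN itself is that a LocalResource request carries the resource's location, size, and timestamp, so a node could in principle key its expanded copy on those values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a per-node archive cache. None of these names are YARN APIs.
class ArchiveCacheSketch {
    // Maps a composite key (path, modification time, size) to the local
    // directory that would hold the expanded archive.
    private final Map<String, String> expandedDirs = new HashMap<>();
    private int expansions = 0;

    // Returns the local directory for the archive, "expanding" it only when
    // no entry exists for this exact (path, modification time, size) triple.
    String localize(String hdfsPath, long modificationTime, long sizeBytes) {
        String key = hdfsPath + "@" + modificationTime + "#" + sizeBytes;
        String dir = expandedDirs.get(key);
        if (dir == null) {
            expansions++;                       // stands in for download + unpack
            dir = "/local/cache/" + expansions; // placeholder directory name
            expandedDirs.put(key, dir);
        }
        return dir;
    }

    // Number of times an archive had to be expanded on this node.
    int expansions() {
        return expansions;
    }

    public static void main(String[] args) {
        ArchiveCacheSketch node = new ArchiveCacheSketch();
        long fourGb = 4L << 30;
        // Two application runs against an unchanged archive, then one run
        // after the archive was replaced (new modification time).
        node.localize("hdfs:///apps/big-archive.zip", 1000L, fourGb);
        node.localize("hdfs:///apps/big-archive.zip", 1000L, fourGb);
        node.localize("hdfs:///apps/big-archive.zip", 2000L, fourGb);
        System.out.println("expansions: " + node.expansions());
    }
}
```

In this model the second run reuses the expanded copy and only the changed modification time triggers a new expansion. Whether YARN's node-level cache behaves this way across application runs is exactly what I am asking.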

If the archive is re-expanded on each node every time the app is launched, should I raise its HDFS replication factor to reduce cross-rack bandwidth use?

Thanks
John
