efficiency of LocalResources and archives
Suppose that I have a large archive in HDFS, say, one containing 500 files and totaling 4 GB. I want to make this available as a YARN LocalResource. The archive doesn't change very often (maybe once per month). Will YARN optimize for this? Does the expanded per-node cache persist across application runs (using something like the modification time to know whether re-expansion is needed)?

If the archive is re-expanded on each node every time the app is launched, should I set the replication factor higher to reduce cross-rack bandwidth use?
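
The caching scheme asked about above (re-expand only when the modification time changes) can be sketched as a toy model in plain Java. This is only an illustration of the idea, not YARN's implementation: the class `ArchiveCacheSketch`, its methods, and the local directory names are all hypothetical, and no real archive is unpacked.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model of a per-node archive cache keyed by path and modification time. */
class ArchiveCacheSketch {

    /** Cache key: the archive path plus the modification time seen at expansion. */
    static final class Key {
        final String path;
        final long modificationTime;

        Key(String path, long modificationTime) {
            this.path = path;
            this.modificationTime = modificationTime;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key k = (Key) other;
            return path.equals(k.path) && modificationTime == k.modificationTime;
        }

        @Override
        public int hashCode() {
            return Objects.hash(path, modificationTime);
        }
    }

    // Maps each (path, modification time) pair to the directory it was expanded into.
    private final Map<Key, String> expandedDirs = new HashMap<>();
    private int expansions = 0;

    /** Returns the local directory for the archive, expanding only on a cache miss. */
    String localize(String path, long modificationTime) {
        Key key = new Key(path, modificationTime);
        String dir = expandedDirs.get(key);
        if (dir == null) {
            // Cache miss: a real node would download and unpack the archive here.
            expansions++;
            dir = "/tmp/archive-cache/" + expansions; // hypothetical local directory
            expandedDirs.put(key, dir);
        }
        return dir;
    }

    /** Number of times an archive had to be expanded. */
    int expansionCount() {
        return expansions;
    }
}
```

In this model, a second launch that presents the same path and modification time reuses the existing directory, while a changed modification time (the monthly update) forces one new expansion.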