MapReduce, mail # user - efficiency of LocalResources and archives


efficiency of LocalResources and archives
John Lilley 2013-06-06, 20:10
Suppose that I have a large archive in HDFS, say, 500 files totaling 4 GB, and I want to make it available as a YARN LocalResource.  The archive doesn't change very often (maybe once per month).  Will YARN optimize for this?  Does the expanded per-node cache persist across application runs (using something like the modification time to know whether re-expansion is needed)?
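
To make the question concrete, here is a minimal sketch of the kind of check I have in mind. It is not YARN's implementation; the class ArchiveCacheSketch, its localize method, and the use of modification time plus size as the cache key are all assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: decides whether a node-local expanded copy of an
// archive can be reused. This is NOT how YARN is known to work; it only
// illustrates the behavior the question asks about.
class ArchiveCacheSketch {

    // Identity of one archive version: source modification time and size.
    static final class Version {
        final long modificationTime;
        final long sizeBytes;

        Version(long modificationTime, long sizeBytes) {
            this.modificationTime = modificationTime;
            this.sizeBytes = sizeBytes;
        }

        boolean matches(Version other) {
            return modificationTime == other.modificationTime
                    && sizeBytes == other.sizeBytes;
        }
    }

    // Archive path -> version that is currently expanded on this node.
    private final Map<String, Version> expanded = new HashMap<>();

    // Returns true if the archive had to be (re-)expanded,
    // false if the existing expanded copy was reused.
    boolean localize(String archivePath, long modificationTime, long sizeBytes) {
        Version requested = new Version(modificationTime, sizeBytes);
        Version cached = expanded.get(archivePath);
        if (cached != null && cached.matches(requested)) {
            return false; // unchanged since last run: reuse the cached copy
        }
        // A real implementation would download and unpack the archive here.
        expanded.put(archivePath, requested);
        return true;
    }
}
```

With a check like this, a second application run against an unchanged archive would skip the 4 GB copy and expansion on that node, and only the monthly update would trigger it again.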

If the archive is re-expanded on each node every time the app is launched, should I set the archive's HDFS replication factor higher to reduce cross-rack bandwidth?

Thanks
John