HDFS >> mail # dev >> Re: problem with HDFS caching in Hadoop 2.3


Re: problem with HDFS caching in Hadoop 2.3
How was this cluster configured? Run hdfs dfsadmin -report to see the
aggregate configured cache capacity as seen by the NN. You need to configure
some GBs of cache on each DN (dfs.datanode.max.locked.memory) and also raise
the ulimit for max locked memory (ulimit -l) to at least that amount.
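
A rough way to sanity-check the locked-memory limit on a DN host, run as the
user that starts the DataNode. This is only a sketch and not part of the
original message: the helper name memlock_allows and the 4 GiB target are
made up for illustration, and it assumes the per-DN cache size is the value
set in dfs.datanode.max.locked.memory (in bytes).

```python
import resource


def memlock_allows(cache_bytes, soft_limit):
    """Return True if a locked-memory soft limit (in bytes) covers cache_bytes.

    Hypothetical helper for illustration; not part of Hadoop.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True  # "unlimited" always suffices
    return soft_limit >= cache_bytes


if __name__ == "__main__":
    # Assumed target: 4 GiB of cache per DataNode (pick your own value).
    wanted = 4 * 1024 ** 3
    # RLIMIT_MEMLOCK is reported in bytes; `ulimit -l` shows the same limit in KB.
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if memlock_allows(wanted, soft):
        print("locked-memory limit is large enough for the planned cache")
    else:
        print("raise the locked-memory limit: soft limit is", soft, "bytes")
```

If the limit is too small, raise it for the DataNode user (for example in
/etc/security/limits.conf) before increasing the cache size.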

I'll also note that you are unlikely to see speedups with most mapreduce
jobs, especially ones running against TextInputFormat. That path does a lot
of copying and string splitting, so it's typically not I/O bound.

The fs -tail command is likely also spending a lot of time on startup
costs, so I wouldn't expect much end-to-end latency savings.

Best,
Andrew
On Sun, Mar 9, 2014 at 5:15 AM, Azuryy Yu <[EMAIL PROTECTED]> wrote: