MapReduce, mail # user - DistributedCache - why not read directly from HDFS?

DistributedCache - why not read directly from HDFS?
Alberto Cordioli 2013-03-23, 14:53
Hi all,

I could not find an answer to the following question. If it has
already been answered, please point me to the right thread.

What are the actual differences between reading a file from HDFS
inside a mapper and using DistributedCache?

I saw that with DistributedCache you can supply an HDFS path and the
task nodes will copy the data to their local file system. But what
advantages does this have over a simple HDFS read with
FileSystem.open(), which returns an FSDataInputStream?

Thank you very much,
Alberto Cordioli