MapReduce >> mail # user >> DistributedCache - why not read directly from HDFS?


DistributedCache - why not read directly from HDFS?
Hi all,

I was not able to find an answer to the following question. If it
has already been answered, please point me to the right thread.

What are the actual differences between reading a file from HDFS
inside a mapper and using DistributedCache?

I saw that with DistributedCache you can give an HDFS path and the
task nodes will get the data on their local file system. But what
advantages does that have over a simple HDFS read through
FileSystem.open(), which returns an FSDataInputStream?
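
To make the comparison concrete, here is a toy model (plain Java, not Hadoop code) of the two access patterns. It assumes that a direct read opens the HDFS file once per task, while a cached file is copied once to each node that runs at least one task of the job. The class name, method names, and task layout are hypothetical and only illustrate the counting.

```java
import java.util.HashSet;
import java.util.Set;

public class CacheVsDirectRead {
    // Direct read: every task opens the HDFS file itself,
    // so the number of opens equals the number of tasks.
    static int fetchesWithDirectRead(int[] taskToNode) {
        return taskToNode.length;
    }

    // Cache-style read: the file is copied once to each node
    // that runs at least one task, then read from local disk.
    static int fetchesWithLocalCopy(int[] taskToNode) {
        Set<Integer> nodes = new HashSet<>();
        for (int node : taskToNode) {
            nodes.add(node);
        }
        return nodes.size();
    }

    public static void main(String[] args) {
        // Hypothetical job: 8 map tasks spread over 3 nodes
        // (entry i is the node that runs task i).
        int[] taskToNode = {0, 0, 0, 1, 1, 2, 2, 2};
        System.out.println("direct reads: " + fetchesWithDirectRead(taskToNode));
        System.out.println("local copies: " + fetchesWithLocalCopy(taskToNode));
    }
}
```

In this model the gap between the two counts grows with the number of tasks per node. It ignores details such as HDFS replica locality, so it is only a rough picture of the trade-off.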

Thank you very much,
Alberto
--
Alberto Cordioli