Difference between HDFS and local filesystem
Hi Users,
I am fairly new to MapReduce programming, and I am trying to understand
the integration between MapReduce and HDFS.
I understand that MapReduce can use HDFS for data access, but is it
possible to run MapReduce programs without using HDFS at all?
HDFS does file replication and partitioning. But suppose I use the
following command to run the MaxTemperature example

  bin/hadoop jar /usr/local/hadoop/maxtemp.jar MaxTemperature \
      file:///usr/local/ncdcinput/sample.txt file:///usr/local/out4

instead of

  bin/hadoop jar /usr/local/hadoop/maxtemp.jar MaxTemperature \
      usr/local/ncdcinput/sample.txt usr/local/out4

The second command uses the HDFS file system.
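
If it helps, my core-site.xml for the pseudo-distributed setup looks
roughly like this (the hdfs://localhost:9000 value is just my local
configuration); I assume the fs.default.name property is why the
scheme-less paths above resolve to HDFS:

  <?xml version="1.0"?>
  <!-- core-site.xml: fs.default.name sets the default filesystem
       that paths without a scheme are resolved against -->
  <configuration>
    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
    </property>
  </configuration>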

The first command, with its file:/// paths, reads from and writes to the
local file system when I run in pseudo-distributed mode. Since everything
is on a single node, there is no problem with non-local data.
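
To double-check where the output landed, I ran something like:

  bin/hadoop fs -ls file:///usr/local/out4  # output of the file:/// run, on local disk
  bin/hadoop fs -ls usr/local/out4          # output of the HDFS run, under my HDFS home dir
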
What happens in fully distributed mode? Will the files be copied to the
other machines, or will the job throw errors? Will the files be replicated
and partitioned for running MapReduce if I use the local file system?

Can someone please explain?