Hadoop >> mail # user >> mapred.system.dir


Re: mapred.system.dir
To expand on Eric's comment: dfs.data.dir is the local filesystem directory
(or directories) that a particular datanode uses to store its slice of the
HDFS data blocks.

So dfs.data.dir might be "/home/hadoop/data/" on some machine; a bunch of
files with inscrutable names like blk_4546857325993894516 will be stored
there. These "blk" files represent chunks of the "real", complete,
user-accessible files in HDFS proper.

mapred.system.dir is a filesystem path like "/system/mapred" which is served
by HDFS, where files used by MapReduce appear. The purpose of the config
file comment is to let you know that you're free to pick a path name like
"/system/mapred" here even though your local Linux machine doesn't have a
path named "/system"; this HDFS path is in a separate (HDFS-specific)
namespace from "/home", "/etc", "/var" and the other various denizens of
your local machine.
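
To make the distinction concrete, here is a minimal sketch in Python that
reads a Hadoop-style configuration and labels the namespace each path lives
in. The XML below is a hypothetical example using the paths from this
message; in a real install the two properties may sit in different
*-site.xml files, and the NAMESPACE table and helper names are mine, not
part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration. The values are the example paths from this
# message, not defaults shipped with Hadoop, and both properties are put in
# one document purely for illustration.
SITE_XML = """
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/data</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/system/mapred</value>
  </property>
</configuration>
"""

# Which namespace each property's path is interpreted in (illustrative table).
NAMESPACE = {
    "dfs.data.dir": "local filesystem of each datanode",
    "mapred.system.dir": "HDFS namespace, served via the namenode",
}


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def describe(props):
    """Return one human-readable line per property, tagged with its namespace."""
    return [
        f"{name} = {value}  [{NAMESPACE.get(name, 'unknown')}]"
        for name, value in props.items()
    ]


if __name__ == "__main__":
    for line in describe(read_properties(SITE_XML)):
        print(line)
```

Both values look like ordinary Unix paths, which is why they are easy to
confuse; only the property name tells you whether the path is looked up on
the local disk or in HDFS.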

- Aaron

On Fri, Feb 12, 2010 at 6:23 AM, Eric Sammer <[EMAIL PROTECTED]> wrote:

> On 2/12/10 8:40 AM, Edson Ramiro wrote:
> > Hi all,
> >
> > I'm setting up a Hadoop cluster and some doubts have
> > arisen about Hadoop configuration.
> >
> > The Hadoop Cluster Setup [1] says that the mapred.system.dir must
> > be in the HDFS and be accessible from both the server and clients.
> >
> > Where is the HDFS directory? Is it the dfs.data.dir?
> >
> > Should I export the mapred.system.dir by NFS or another protocol to
> > make it accessible from the server and clients?
> >
> > Thanks in advance
> >
> > [1] http://hadoop.apache.org/common/docs/current/cluster_setup.html
> >
> > Edson Ramiro
> >
>
> Edson:
>
> An HDFS file system is a distributed global view controlled by the
> namenode. If a file is "in HDFS" all clients and servers that are
> pointed at the namenode will be able to see everything. This means that
> you don't need to do anything special to export or reveal the
> mapred.system.dir; that's what HDFS does. It's worth reading the HDFS
> Architecture paper on the Hadoop site or the Google GFS paper for
> details on how this all works and how it relates to map reduce.
>
> HTH.
> --
> Eric Sammer
> [EMAIL PROTECTED]
> http://esammer.blogspot.com
>