Hadoop, mail # user - mapred.system.dir


Re: mapred.system.dir
Aaron Kimball 2010-02-13, 01:14
To expand on Eric's comment: dfs.data.dir is the local filesystem directory
(or directories) that a particular datanode uses to store its slice of the
HDFS data blocks.

So dfs.data.dir might be "/home/hadoop/data/" on some machine; a bunch of
files with inscrutable names like blk_4546857325993894516 will be stored
there. These "blk" files hold chunks (blocks) of the "real", complete,
user-accessible files in HDFS proper.
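
To make the split between the two views concrete, here is a toy sketch in
Python. It is not Hadoop code: the function names, the 4-byte block size, the
random block ids and the path "/user/edson/report.txt" are all made up for
illustration, and a real datanode keeps more state than this. It only shows
that the local data directory ends up holding "blk_" files, while the
user-visible file name exists only in a separate namespace.

```python
import os
import random
import tempfile

BLOCK_SIZE = 4  # bytes; tiny on purpose, real HDFS blocks are tens of megabytes


def store_in_toy_hdfs(namespace, data_dir, hdfs_path, payload):
    """Split payload into blocks, write each one as a blk_<id> file in
    data_dir, and record hdfs_path -> [block ids] in the namespace dict."""
    block_ids = []
    for offset in range(0, len(payload), BLOCK_SIZE):
        block_id = random.getrandbits(63)  # stand-in for an opaque block id
        with open(os.path.join(data_dir, f"blk_{block_id}"), "wb") as f:
            f.write(payload[offset:offset + BLOCK_SIZE])
        block_ids.append(block_id)
    namespace[hdfs_path] = block_ids
    return block_ids


def read_from_toy_hdfs(namespace, data_dir, hdfs_path):
    """Reassemble a file by reading its blocks back in order."""
    chunks = []
    for block_id in namespace[hdfs_path]:
        with open(os.path.join(data_dir, f"blk_{block_id}"), "rb") as f:
            chunks.append(f.read())
    return b"".join(chunks)


namespace = {}                                    # plays the namenode's role
data_dir = tempfile.mkdtemp(prefix="dfs_data_")   # plays dfs.data.dir
payload = b"hello hadoop"

store_in_toy_hdfs(namespace, data_dir, "/user/edson/report.txt", payload)
local_files = sorted(os.listdir(data_dir))
roundtrip = read_from_toy_hdfs(namespace, data_dir, "/user/edson/report.txt")
```

Listing data_dir afterwards shows only blk_ files; nothing in it is named
report.txt, which is the situation you see when you look inside a real
dfs.data.dir.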

mapred.system.dir is a filesystem path like "/system/mapred" that lives in
HDFS; it is where files used by MapReduce appear. The purpose of the config
file comment is to let you know that you're free to pick a path name like
"/system/mapred" here even though your local Linux machine doesn't have a
path named "/system"; this HDFS path is in a separate (HDFS-specific)
namespace from "/home", "/etc", "/var" and the other various denizens of
your local machine.
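
One way to picture the separate namespace: a path with no scheme is resolved
against the cluster's default filesystem, not against the local disk. The
Python sketch below only mimics that idea. The qualify() helper and the
namenode address "hdfs://namenode.example.com:8020" are hypothetical, and
Hadoop's own path handling is more involved than this.

```python
from urllib.parse import urlparse


def qualify(path, default_fs):
    """Return path as a full URI. A path that already names a scheme is
    returned unchanged; a bare path is resolved against default_fs."""
    if urlparse(path).scheme:
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


DEFAULT_FS = "hdfs://namenode.example.com:8020"  # hypothetical namenode URI

# A bare path such as the mapred.system.dir value lands in HDFS.
system_dir = qualify("/system/mapred", DEFAULT_FS)

# An explicitly local path stays on the local filesystem.
local_dir = qualify("file:///home/hadoop/data", DEFAULT_FS)
```

Here system_dir comes out as "hdfs://namenode.example.com:8020/system/mapred",
so it never has to exist under "/" on any individual machine.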

- Aaron

On Fri, Feb 12, 2010 at 6:23 AM, Eric Sammer <[EMAIL PROTECTED]> wrote:

> On 2/12/10 8:40 AM, Edson Ramiro wrote:
> > Hi all,
> >
> > I'm setting up a Hadoop cluster and some doubts have
> > arisen about Hadoop configuration.
> >
> > The Hadoop Cluster Setup [1] says that the mapred.system.dir must
> > be in the HDFS and be accessible from both the server and clients.
> >
> > Where is the HDFS directory? Is it the dfs.data.dir?
> >
> > Should I export the mapred.system.dir over NFS or another protocol to
> > make it accessible from the server and the clients?
> >
> > Thanks in advance
> >
> > [1] http://hadoop.apache.org/common/docs/current/cluster_setup.html
> >
> > Edson Ramiro
> >
>
> Edson:
>
> An HDFS file system is a distributed global view controlled by the
> namenode. If a file is "in HDFS", all clients and servers that are
> pointed at the namenode will be able to see it. This means that
> you don't need to do anything special to export or reveal the
> mapred.system.dir; that's what HDFS does. It's worth reading the HDFS
> Architecture paper on the Hadoop site or the Google GFS paper for
> details on how this all works and how it relates to MapReduce.
>
> HTH.
> --
> Eric Sammer
> [EMAIL PROTECTED]
> http://esammer.blogspot.com
>