MapReduce >> mail # user >> RE: cluster set-up / a few quick questions - SOLVED

Re: cluster set-up / a few quick questions
On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:
> Gents,

We're not all male here. :)  I prefer "Hadoopers" or "hi all,".

> 1.
> - do you put Master's node <hostname> under fs.default.name in core-site.xml on the slave machines or slaves' hostnames?

The master's.  I have a 4-node cluster, with nodes named foo1 through
foo4. My fs.default.name is hdfs://foo1.domain.com on every node.
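
For illustration, a core-site.xml fragment along these lines would be
the same on all four nodes. Only the property name and the value come
from the setup described above; the surrounding layout is the standard
Hadoop configuration-file format, and the rest of the file is omitted.

```xml
<!-- core-site.xml: same content on foo1, foo2, foo3 and foo4 -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- Always the master (namenode) host, never the local slave's name -->
    <value>hdfs://foo1.domain.com</value>
  </property>
</configuration>
```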

> - do you need to run "sudo -u hdfs hadoop namenode -format" and create /tmp /var folders on the HDFS of the slave machines that will be running only DN and TT or not? Do you still need to create hadoop/dfs/name folder on the slaves?

(The following is the simple answer, for non-HA non-federated HDFS.
You'll want to get the simple example working before trying the
complicated ones.)

No. A cluster has one namenode, running on the machine known as the
master, and the admin must run "hadoop namenode -format" on that
machine only.

In my example, I ran "hadoop namenode -format" on foo1.

> 2.
> In hdfs-site.xml for dfs.name.dir & dfs.data.dir properties  we specify  /hadoop/dfs/name /hadoop/dfs/data  being  local linux NFS directories by running command "mkdir -p /hadoop/dfs/data"
> but mapred.system.dir  property is to point to HDFS and not NFS  since we are running "sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system"??
> If so and since it is exactly the same format  /far/boo/baz how does hadoop know which directory is local on NFS or HDFS?

This is very confusing, to be sure!  There are a few places where a
path is implicitly interpreted as an HDFS path rather than a Linux
filesystem path, and mapred.system.dir is one of those. This does mean
that given a string that starts with "/tmp/" you can't necessarily
know whether it's a Linux path or an HDFS path without looking at the
larger context.

In the case of mapred.system.dir, the docs are the place to check;
according to cluster_setup.html, mapred.system.dir is "Path on the
HDFS where where the Map/Reduce framework stores system files".
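
To make the contrast concrete, here is a minimal sketch using the paths
from the question. The values are illustrative rather than
recommendations, and the comments mark which filesystem each path is
interpreted against. Nothing in the syntax distinguishes the two cases;
only the property's documented meaning does.

```xml
<!-- hdfs-site.xml: both values are directories on the local Linux filesystem -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <!-- Local path; only the namenode (master) uses it -->
    <value>/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <!-- Local path on each datanode -->
    <value>/hadoop/dfs/data</value>
  </property>
</configuration>
```

```xml
<!-- mapred-site.xml: this value is a path inside HDFS, not on local disk -->
<configuration>
  <property>
    <name>mapred.system.dir</name>
    <!-- HDFS path, the one created with "hadoop fs -mkdir" in the question -->
    <value>/tmp/mapred/system</value>
  </property>
</configuration>
```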


Hope this helps,
Andy Isaacson 2012-10-26, 21:32