Chao Huang 2012-05-24, 11:30
Re: New bie questions re: namenode and masters
Hi Chao,

What documentation are you reading? This one is pretty accurate:
http://hadoop.apache.org/common/docs/r0.20.203.0/hdfs_design.html

The NameNode is indeed responsible for the metadata, and all the DataNodes
report to the NameNode (so they are all slaves). You are right, the data
blocks are stored on the DataNodes. Perhaps I am lacking knowledge of the
history, but as of now there's no separate "master server": the NameNode is
the master. For reads and writes, a client first asks the NameNode which
DataNodes hold (or should receive) each block, and then transfers the data
directly to or from those DataNodes.
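
To make the read path concrete, here is a toy Python model. It is not Hadoop
code: ToyNameNode, ToyDataNode, read_file and the sample path are made-up
names for illustration, and replica selection is simplified to "take the
first host listed". The point it shows is that the NameNode answers only with
block locations, while the bytes come from the DataNodes.

```python
# Toy model of the HDFS read path; all names here are hypothetical.

class ToyDataNode:
    """Stores block contents keyed by block id."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def read_block(self, block_id):
        return self.blocks[block_id]


class ToyNameNode:
    """Holds metadata only: file -> block ids, block id -> replica hosts."""

    def __init__(self):
        self.file_blocks = {}
        self.block_locations = {}

    def get_block_locations(self, path):
        # Returns (block_id, [host, ...]) pairs; no file data is returned.
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]


def read_file(namenode, datanodes, path):
    """Ask the NameNode where the blocks live, then fetch them from DataNodes."""
    data = b""
    for block_id, hosts in namenode.get_block_locations(path):
        # Simplification: use the first replica listed.
        data += datanodes[hosts[0]].read_block(block_id)
    return data


# Example setup: one file, two blocks, three replicas of each block.
datanodes = {n: ToyDataNode(n) for n in ("machineBBB", "machineCCC", "machineDDD")}
namenode = ToyNameNode()
namenode.file_blocks["/logs/a.txt"] = ["blk_1", "blk_2"]
namenode.block_locations = {
    "blk_1": ["machineBBB", "machineCCC", "machineDDD"],
    "blk_2": ["machineCCC", "machineDDD", "machineBBB"],
}
contents = {"blk_1": b"hello ", "blk_2": b"world"}
for block_id, hosts in namenode.block_locations.items():
    for host in hosts:
        datanodes[host].blocks[block_id] = contents[block_id]

print(read_file(namenode, datanodes, "/logs/a.txt"))
```
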

So your configuration for replication factor 3 would look like this (you need
at least three DataNodes, since each replica of a block goes on a different
node):

in conf/core-site.xml:
    fs.default.name = hdfs://machineAAA:54321/

in conf/slaves:
    machineBBB
    machineCCC
    machineDDD
    machineEEE
    ....possibly a lot more
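
The "name = value" lines above are shorthand; the site files themselves are
XML. As a rough sketch, the Python below renders the three files for a given
host list. The helper names (property_xml, site_xml, render_cluster) are my
own, and the check that there are at least as many DataNodes as replicas is a
sanity check added here, not something Hadoop enforces at startup. It also
writes dfs.replication into conf/hdfs-site.xml explicitly, although 3 is
already the default.

```python
# Sketch: render Hadoop 0.20-style config files from a host list.
# Helper names are hypothetical; property names are the 0.20.x ones.
from xml.sax.saxutils import escape


def property_xml(name, value):
    """Render one <property> element."""
    return (
        "  <property>\n"
        f"    <name>{escape(name)}</name>\n"
        f"    <value>{escape(value)}</value>\n"
        "  </property>\n"
    )


def site_xml(props):
    """Wrap properties in a <configuration> root element."""
    body = "".join(property_xml(k, v) for k, v in props.items())
    return '<?xml version="1.0"?>\n<configuration>\n' + body + "</configuration>\n"


def render_cluster(namenode, port, datanodes, replication=3):
    """Return a dict mapping config file path -> file contents."""
    # Added sanity check: replicas of a block live on distinct DataNodes.
    if len(datanodes) < replication:
        raise ValueError(
            f"replication factor {replication} needs at least "
            f"{replication} DataNodes, got {len(datanodes)}"
        )
    return {
        "conf/core-site.xml": site_xml(
            {"fs.default.name": f"hdfs://{namenode}:{port}/"}
        ),
        "conf/hdfs-site.xml": site_xml({"dfs.replication": str(replication)}),
        "conf/slaves": "\n".join(datanodes) + "\n",
    }


files = render_cluster(
    "machineAAA", 54321,
    ["machineBBB", "machineCCC", "machineDDD", "machineEEE"],
)
for path, text in files.items():
    print(f"--- {path}\n{text}")
```

With only machineCCC and machineDDD in the host list, render_cluster raises
ValueError, which mirrors why two DataNodes are not enough for three replicas.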
Hope this helps
Ravi
On Thu, May 24, 2012 at 6:30 AM, Chao Huang <[EMAIL PROTECTED]> wrote:

> Hello experts,
>
> I'm new to HDFS/Hadoop.  After reading the HDFS documents, I'm
> confused by the differences between a namenode and a master server.  It's
> my understanding that the namenode is responsible for managing metadata,
> while the master-replica group (which is composed of a number of
> datanodes) stores the actual data blocks.  In the master-replica group, the
> master server accepts read/write requests, and load balances (or routes)
> read requests to the appropriate replica. In other words, we should
> configure the namenode and master server on two different physical machines
> in a production environment, right?  Is this a correct assumption?
>
> One other question about HDFS cluster setup:
>
> - requirements:  one namenode, replication factor = 3, in a production
> environment.
>
> What would the topology look like?  Can I configure it as follows?
>
>
> in conf/core-site.xml:
>     fs.default.name = hdfs://machineAAA:54321/
>
> in conf/masters:
>     machineBBB
>
> in conf/slaves:
>     machineCCC
>     machineDDD
>
>
> Can someone please confirm and/or comment?
>
> Sorry for my newbie questions. Thanks for the help.
>
> Chao
>
Chao Huang 2012-05-24, 13:21