Re: cluster set-up / a few quick questions
On Fri, Oct 26, 2012 at 11:47 AM, Kartashov, Andy
<[EMAIL PROTECTED]> wrote:
> I successfully ran a job on a cluster on foo1 in pseudo-distributed mode and am now trying a fully-distributed one.
>
> a. I created another instance foo2 on EC2.

It seems like you're trying to use the start-dfs.sh style startup
scripts to manually run a cluster on EC2.  This is doable, but it's
not very easy due to the mismatch in expectations between EC2 style
deployments and start-dfs.sh.  Setting up a manually started cluster
requires a bit of up-front work, and EC2 spin-up/spin-down cycles mean
you end up redoing that work frequently.

You might consider using Apache Whirr (http://whirr.apache.org/) as a more
automated way of deploying Hadoop clusters on EC2.
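
For reference -- and I'm going from memory of the Whirr quick-start guide
here, so treat the exact property names and values as approximate -- a
minimal Whirr recipe looks something like:

% cat hadoop.properties
whirr.cluster-name=testcluster
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,1 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
% whirr launch-cluster --config hadoop.properties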

Of course, setting up a manual cluster can be a really good way to
understand how all the parts work together, and doing it on EC2 should
work just fine.

> Installed Hadoop on it and copied the conf/ folder from foo1 to foo2. I created a /hadoop/dfs/data folder on the local Linux filesystem on foo2.
>
> b. on foo1 I created file conf/slaves and added:
> localhost
> <hostname-of-foo2>

I'd strongly recommend being consistent with the naming: don't mix
"localhost" and DNS names. EC2 has "ec2.internal" in /etc/resolv.conf
by default, so you can "ping ip-10-42-120-3" and it should work just
fine. Then make conf/masters list your first host by name, and make
conf/slaves list all your hosts by name. Note that for small clusters,
running a DN and a NN on a single host is an acceptable compromise and
works OK.

% cat conf/masters
ip-10-42-120-3
% cat conf/slaves
ip-10-42-120-3
ip-10-42-115-32
%
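
One more place the localhost-vs-hostname mix tends to hide when you copy
conf/ over from a pseudo-distributed setup (I'm assuming that's what is in
your copied conf/): fs.default.name and mapred.job.tracker should point at
the master by name as well, something like

  <property><name>fs.default.name</name><value>hdfs://ip-10-42-120-3:8020</value></property>

in conf/core-site.xml, and

  <property><name>mapred.job.tracker</name><value>ip-10-42-120-3:8021</value></property>

in conf/mapred-site.xml. The ports here are just the ones I happen to use;
keep whatever you already have.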

You should also make sure that your user account can ssh to all the nodes:
% for h in $(cat conf/slaves); do ssh -oStrictHostKeyChecking=no $h hostname; done

 - answer "yes" to any host key verification prompts
 - if you get "Permission denied" messages, you'll need to set up the
authorized_keys properly
 - after this loop succeeds you should be able to run it again and get
a clean list of hostnames.
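
If you do hit "Permission denied", the usual fix is key-based ssh from this
account to every slave; roughly (the .pem path is whatever EC2 keypair you
already use to reach foo2, and I'm assuming ~/.ssh already exists there):

% ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
% cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
% ssh -i /path/to/ec2-keypair.pem ip-10-42-115-32 'cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub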

> At this point I cannot find an answer on what to do next.
>
> I started NN, DN, SNN, JT, TT on foo1. After I ran "hadoop fsck /user/bar -files -blocks -locations", it showed the number of datanodes as 1. I was expecting the DN and TT on foo2 to be started by foo1, but that didn't happen, so I started them myself and tried the command again. Still one DN.

You don't need to start the daemons individually, and doing so is very
difficult to get right. I virtually never do so -- I use the start-*.sh
scripts (start-dfs.sh, start-mapred.sh) to start the daemons (NN, DN,
JT, TT, etc.). The "masters" and "slaves" config files are parsed by the
start-*.sh scripts, not by the daemons themselves. And the daemons don't
start themselves -- for a manual cluster, the start-*.sh scripts are
responsible. (In a production deployment such as CDH, there are
/etc/init.d scripts, managed by the distro packaging, that start and
manage the daemons.)
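
Concretely -- assuming a stock Apache 1.x / 0.20-style tarball install with
the scripts under bin/ -- bringing the whole cluster up from foo1 looks
something like:

% bin/start-dfs.sh        # NN + SNN here, plus a DN on every host in conf/slaves
% bin/start-mapred.sh     # JT here, plus a TT on every host in conf/slaves
% jps                     # on foo1: NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker
% bin/hadoop dfsadmin -report    # "Datanodes available" should read 2 once foo2's DN checks in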

-andy