MapReduce >> mail # user >> cluster set-up / a few quick questions


Re: cluster set-up / a few quick questions - SOLVED
You can get the script from the Hadoop codebase at
http://svn.apache.org/viewcvs.cgi/hadoop/common/trunk/
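For reference, the 1.x start-all.sh in that tree is only a thin wrapper over the HDFS and MapReduce start scripts. A minimal sketch of what it amounts to; the HADOOP_HOME default below is an assumption about the install layout, not something stated in this thread:

```shell
#!/usr/bin/env bash
# Minimal sketch of what Hadoop 1.x's bin/start-all.sh boils down to:
# it delegates to the HDFS and MapReduce start scripts, which in turn
# ssh into every host listed in conf/slaves.
# The HADOOP_HOME default is an assumption about the install location.
HADOOP_HOME=${HADOOP_HOME:-/usr/lib/hadoop}

for helper in start-dfs.sh start-mapred.sh; do
  script="$HADOOP_HOME/bin/$helper"
  if [ -x "$script" ]; then
    "$script"                 # start-dfs.sh: NN, SNN, DNs; start-mapred.sh: JT, TTs
  else
    echo "not found: $script" # packaged installs often ship service init scripts instead
  fi
done
```

Packaged distributions (as on Andy's EC2 machines) frequently omit these wrapper scripts, which is why they could not be found on the node.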
On Fri, Nov 2, 2012 at 12:41 AM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:

> People,
>
> While I did not find the start-balancer.sh script on my machine, I
> successfully utilized the following command:
>
> "$ hadoop balancer -threshold 10" and achieved the exact same result.
>
> One issue remains: controlling start/stop of the slaves' daemons through
> the master. Somehow I have neither the start-dfs.sh/stop-dfs.sh nor the
> start-all.sh script on my machine either. For now, I am starting the dfs
> and mapreduce daemons on each slave manually and individually.
>
> Can someone post the content of the start-all.sh script so I could utilize
> it for my environment?
>
> Thanks,
> AK47
>
>
> -----Original Message-----
> From: Kartashov, Andy
> Sent: Friday, October 26, 2012 3:56 PM
> To: [EMAIL PROTECTED]
> Subject: RE: cluster set-up / a few quick questions - SOLVED
>
> Hadoopers,
>
> The problem was in EC2 security. While I could passwordlessly ssh into
> another node and back, I could not telnet to it due to the EC2 firewall. I
> needed to open ports for the NN and JT. :)
>
> Now I can see 2 DNs when running "hadoop fsck", and can also -ls into the NN
> from the slave. Sweet!!!
>
> Is it possible to balance data over the DNs without copying it with the
> "hadoop -put" command? I read about bin/start-balancer.sh somewhere but
> cannot find it in my current hadoop installation.
> Besides, is balancing data over the DNs going to improve the performance of an MR job?
>
> Cheers,
> Happy Hadooping.
>
> -----Original Message-----
> From: Nitin Pawar [mailto:[EMAIL PROTECTED]]
> Sent: Friday, October 26, 2012 3:18 PM
> To: [EMAIL PROTECTED]
> Subject: Re: cluster set-up / a few quick questions
>
> Questions:
>
> 1) Have you set up passwordless ssh between both hosts for the user who
> owns the hadoop processes (or root)?
> 2) If the answer to question 1 is yes, how did you start the NN, JT, DN, and TT?
> 3) If you started them one by one, there is no reason that running a command on
> one node will execute it on the other.
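Point 3 is worth unpacking: the cluster start scripts are what fan commands out over ssh, so daemons started by hand on one node do nothing on the others. A rough sketch of what Hadoop 1.x's bin/slaves.sh does; the HADOOP_CONF_DIR default is an assumption about the install layout:

```shell
# Rough sketch of what Hadoop 1.x's bin/slaves.sh does: read conf/slaves
# and run the same command on every listed host over ssh. This is why a
# daemon started by hand on one node does nothing on the others.
# The HADOOP_CONF_DIR default is an assumption about the install layout.
slaves_run() {
  local conf="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
  local host
  while read -r host; do
    case $host in ''|"#"*) continue ;; esac  # skip blank lines and comments
    ssh "$host" "$@" &                       # fan the command out in parallel
  done < "$conf/slaves"
  wait                                       # block until every slave finishes
}
```

The real script also honors HADOOP_SSH_OPTS and prefixes each line of output with the slave's hostname; this keeps only the fan-out essence.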
>
>
> On Sat, Oct 27, 2012 at 12:17 AM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:
> > Andy, many thanks.
> >
> > I am stuck here now, so please point me in the right direction.
> >
> > I successfully ran a job on a cluster on foo1 in pseudo-distributed mode
> > and am now trying a fully-distributed one.
> >
> > a. I created another instance foo2 on EC2, installed hadoop on it, and
> > copied the conf/ folder from foo1 to foo2. I created the /hadoop/dfs/data
> > folder on the local linux system on foo2.
> >
> > b. on foo1 I created file conf/slaves and added:
> > localhost
> > <hostname-of-foo2>
> >
> > At this point I cannot find an answer on what to do next.
> >
> > I started the NN, DN, SNN, JT, and TT on foo1. After I ran "hadoop fsck
> > /user/bar -files -blocks -locations", it showed the # of datanodes as 1. I was
> > expecting the DN and TT on foo2 to be started by foo1. But it didn't happen, so
> > I started them myself and tried the command again. Still one DN.
> > I realise that foo2 has no data at this point, but I could not find the
> > bin/start-balancer.sh script to help me balance data over to the DN from
> > foo1 to foo2.
> >
> > What do I do next?
> >
> > Thanks
> > AK
> >
> > -----Original Message-----
> > From: Andy Isaacson [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, October 26, 2012 2:21 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: cluster set-up / a few quick questions
> >
> > On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:
> >> Gents,
> >
> > We're not all male here. :)  I prefer "Hadoopers" or "hi all,".
> >
> >> 1.
> >> - do you put the Master node's <hostname> under fs.default.name in
> >> core-site.xml on the slave machines, or the slaves' hostnames?
> >
> > Master. I have a 4-node cluster, named foo1 - foo4. My fs.default.name is
> > hdfs://foo1.domain.com.
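Concretely, that answer means every node's conf/core-site.xml points at the master. A sketch, using the example hostname from this thread (the surrounding file layout is standard Hadoop 1.x, not quoted from the thread):

```xml
<?xml version="1.0"?>
<!-- conf/core-site.xml, identical on the master and every slave.
     foo1.domain.com is the example master hostname from this thread. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://foo1.domain.com</value>
  </property>
</configuration>
```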
> >
> >> - do you need to run "sudo -u hdfs hadoop namenode -format" and create

Nitin Pawar