Re: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]
Nitin Pawar 2012-12-02, 18:48

Also, if you want to set up a Hadoop cluster on AWS, just try using Whirr.
Basically, it does everything for you.

On Sun, Dec 2, 2012 at 10:12 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Your problem is that your /etc/hosts file has the line:
>
> 127.0.0.1 nutchcluster1
>
> Just delete that line and restart your services. You intend your hostname
> "nutchcluster1" to be externally accessible, so aliasing it to the
> loopback address (127.0.0.1) is not right.
>
> On Sun, Dec 2, 2012 at 10:08 PM, A Geek <[EMAIL PROTECTED]> wrote:
> > Hi,
> > Just to add the version details: I'm running Apache Hadoop release 1.0.4
> > with jdk1.6.0_37. The underlying Ubuntu 12.04 machine has 300GB of disk
> > space and 1.7GB of RAM, and is a single-core machine.
> >
> > Regards,
> > DW
> >
> > ________________________________
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > Subject: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based
> > machines]
> > Date: Sun, 2 Dec 2012 15:55:09 +0000
> >
> > Hi All,
> > I'm trying to set up a Hadoop cluster using 4 machines [4 x Ubuntu 12.04
> > x_64], using the following doc:
> > 1. http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial
> >
> > I'm able to set up Hadoop clusters with the required configurations. I can
> > see that all the required services on the master and on the slave nodes are
> > running as required [please see the jps command output below]. The problem
> > I'm facing is that the HDFS and MapReduce daemons running on the master can
> > be accessed from the master only, and not from the slave machines. Note
> > that I've added these ports to the EC2 security group to open them.
> > And I can browse the master machine's UI from a web browser, using:
> > http://<machine ip>:50070/dfshealth.jsp
> >
> > Now, the problem I'm facing is that HDFS as well as the jobtracker are both
> > accessible from the master machine [I'm using the master as both Namenode
> > and Datanode], but the ports used for these two [hdfs: 54310 and
> > mapreduce: 54320] are not accessible from the other slave nodes.
> >
> > I did: netstat -puntl on the master machine and got this:
> >
> > hadoop@nutchcluster1:~/hadoop$ netstat -puntl
> > (Not all processes could be identified, non-owned process info
> > will not be shown, you would have to be root to see it all.)
> > Active Internet connections (only servers)
> > Proto Recv-Q Send-Q Local Address     Foreign Address  State   PID/Program name
> > tcp        0      0 0.0.0.0:22        0.0.0.0:*        LISTEN  -
> > tcp6       0      0 :::50020          :::*             LISTEN  6224/java
> > tcp6       0      0 127.0.0.1:54310   :::*             LISTEN  6040/java
> > tcp6       0      0 127.0.0.1:32776   :::*             LISTEN  6723/java
> > tcp6       0      0 :::57065          :::*             LISTEN  6040/java
> > tcp6       0      0 :::50090          :::*             LISTEN  6401/java
> > tcp6       0      0 :::50060          :::*             LISTEN  6723/java
> > tcp6       0      0 :::50030          :::*             LISTEN  6540/java
> > tcp6       0      0 127.0.0.1:54320   :::*             LISTEN  6540/java
> > tcp6       0      0 :::45747          :::*             LISTEN  6401/java
> > tcp6       0      0 :::33174          :::*             LISTEN  6540/java
> > tcp6       0      0 :::50070          :::*             LISTEN  6040/java
> > tcp6       0      0 :::22             :::*             LISTEN  -
> > tcp6       0      0 :::54424          :::*             LISTEN  6224/java
> > tcp6       0      0 :::50010          :::*             LISTEN  6224/java
> > tcp6       0      0 :::50075          :::*             LISTEN  6224/java
> > udp        0      0 0.0.0.0:68        0.0.0.0:*                -
> > hadoop@nutchcluster1:~/hadoop$
> >
> > As can be seen in the output, both the HDFS daemon and the MapReduce daemons
> > are accessible, but only from 127.0.0.1 and not from 0.0.0.0 [any
> > machine/slave

Nitin Pawar
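
For anyone hitting the same symptom, here is a rough Python sketch of the two checks discussed in this thread: finding /etc/hosts lines that alias a hostname to a loopback address, and spotting listeners in `netstat -puntl` output that are bound only to loopback. The function names (`loopback_aliases`, `loopback_only_listeners`), the sample data, and the `10.0.0.5 nutchcluster2` entry are made up for illustration; only the `127.0.0.1 nutchcluster1` line and the netstat rows come from the messages above. It works on text you pass in, so it is not tied to any particular machine.

```python
import ipaddress


def loopback_aliases(hosts_text, hostname):
    """Return hosts-file lines that map `hostname` to a loopback address."""
    offending = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if is_loopback and hostname in names:
            offending.append(raw.strip())
    return offending


def loopback_only_listeners(netstat_text):
    """Return local addresses in netstat-style output that are bound to loopback."""
    bound = []
    for raw in netstat_text.splitlines():
        fields = raw.split()
        # Data rows start with tcp/tcp6/udp/udp6; the 4th column is the local address.
        if len(fields) < 4 or not fields[0].startswith(("tcp", "udp")):
            continue
        local = fields[3]
        host, _, _port = local.rpartition(":")
        try:
            if ipaddress.ip_address(host).is_loopback:
                bound.append(local)
        except ValueError:
            continue  # unparseable address column; skip the row
    return bound


# Illustrative sample data (the 10.0.0.5 entry is hypothetical).
SAMPLE_HOSTS = "\n".join([
    "127.0.0.1 localhost",
    "127.0.0.1 nutchcluster1",
    "10.0.0.5 nutchcluster2",
])

SAMPLE_NETSTAT = "\n".join([
    "Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name",
    "tcp6 0 0 127.0.0.1:54310 :::* LISTEN 6040/java",
    "tcp6 0 0 :::50070 :::* LISTEN 6040/java",
    "tcp6 0 0 127.0.0.1:54320 :::* LISTEN 6540/java",
])

print(loopback_aliases(SAMPLE_HOSTS, "nutchcluster1"))
print(loopback_only_listeners(SAMPLE_NETSTAT))
```

On a real node you could feed it the contents of /etc/hosts and the captured netstat output instead of the samples. If the first call returns anything for the master's hostname, that is the line Harsh suggests deleting before restarting the services.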