Hadoop >> mail # user >> Does hadoop installations need to be at same locations in cluster ?


praveenesh kumar 2011-12-23, 12:50
Michael Segel 2011-12-23, 12:56
praveenesh kumar 2011-12-23, 14:17
Michael Segel 2011-12-23, 17:55
Re: Does hadoop installations need to be at same locations in cluster ?
Agreed that using different locations is not a good idea.
However, the question was: can it be done? Yes, with some hacking, I suppose.
Do I recommend hacking? No.

But if you cannot help yourself: to keep the data node directories in a
different location on each slave, create a separate hdfs-site.xml per node
(enjoy).
The Hadoop installation itself is a bit more tricky.
Look at bin/hadoop-daemons.sh. It finds the location it is running
from and assumes that the slaves use the same location.
For further hackery and confusion, look at the HADOOP_SSH_OPTS environment
variable set in hadoop-env.sh. Note that passing HADOOP_CONF_DIR this way
requires support from the server. The ssh daemon may not accept variables
sent with the client-side SendEnv option; this restriction keeps out LD_*
types of environment variables, which would open a security hole.
See the AcceptEnv setting in /etc/sshd_config (often /etc/ssh/sshd_config)
on the slaves.
Alternatively, you can create a symlink on each slave, at the same path the
master uses, pointing to that slave's actual install location.
Finally, you may be able to start the Hadoop daemons by hand.
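
As a rough illustration of the "by hand" route, here is a minimal Python sketch that builds one ssh command per slave, each using that slave's own install path. The host names and directories are made up, and the sketch assumes the configuration lives in conf/ under each install directory; it only prints the commands, it does not run them. bin/hadoop-daemon.sh (singular) is the per-node script that hadoop-daemons.sh normally invokes over ssh.

```python
import shlex

# Hypothetical mapping of slave host -> Hadoop install directory on that host.
HADOOP_HOMES = {
    "slave1.example.com": "/opt/hadoop",
    "slave2.example.com": "/usr/local/hadoop",
}


def start_command(host, hadoop_home, daemon="datanode"):
    """Build the ssh argv that starts one daemon using that host's own paths.

    Assumption: the config directory is <hadoop_home>/conf on every host.
    """
    home = shlex.quote(hadoop_home)
    remote = "{0}/bin/hadoop-daemon.sh --config {0}/conf start {1}".format(
        home, shlex.quote(daemon))
    return ["ssh", host, remote]


def all_start_commands(homes, daemon="datanode"):
    """Return one ssh argv per host, in sorted host order."""
    return [start_command(h, p, daemon) for h, p in sorted(homes.items())]


if __name__ == "__main__":
    # Dry run: print the commands for review instead of executing them.
    for argv in all_start_commands(HADOOP_HOMES):
        print(" ".join(shlex.quote(a) for a in argv))
```

Keeping the host-to-path mapping in one place at least confines the per-node differences to a single table, though it does not remove the maintenance burden described in the rest of the thread.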

Have the correct amount of fun!

Joep

On Fri, Dec 23, 2011 at 9:55 AM, Michael Segel <[EMAIL PROTECTED]> wrote:

>
> Ok,
>
> Here's the thing...
>
> 1) When building the cluster, you want to be consistent.
> 2) Location of $HADOOP_HOME is configurable. So you can place it anywhere.
>
> Putting the software in two different locations isn't a good idea because
> you now have to set it up with a unique configuration per node.
>
> It would be faster, and make your life a lot easier, to put the software
> in the same location on *all* machines.
> So my suggestion would be to bite the bullet and rebuild your cluster.
>
> HTH
>
> -Mike
>
>
> > Date: Fri, 23 Dec 2011 19:47:45 +0530
> > Subject: Re: Does hadoop installations need to be at same locations in cluster ?
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> >
> > What I mean to say is: does Hadoop internally assume that the
> > installation on every node is in the same location?
> > I had Hadoop installed in different locations on 2 different nodes.
> > I configured the Hadoop config files so both are part of the same cluster.
> > But when I started Hadoop on the master, I saw it was also searching for
> > the Hadoop start scripts in the same location as on the master.
> > Is there any workaround for this kind of situation, or do I have to
> > reinstall Hadoop in the same location as on the master?
> >
> > Thanks,
> > Praveenesh
> >
> > On Fri, Dec 23, 2011 at 6:26 PM, Michael Segel
> > <[EMAIL PROTECTED]> wrote:
> > > Sure,
> > > You could do that, but in doing so, you will make your life a living hell.
> > > Literally.
> > >
> > > Think about it... You will have to manually manage each node's config files...
> > >
> > > So if something goes wrong you will have a hard time diagnosing the issue.
> > >
> > > Why make life harder?
> > >
> > > Why not just do the simple thing and make all of your DNs the same?
> > >
> > > Sent from my iPhone
> > >
> > > On Dec 23, 2011, at 6:51 AM, "praveenesh kumar" <[EMAIL PROTECTED]> wrote:
> > >
> > >> When installing Hadoop on slave machines, do we have to install Hadoop
> > >> at the same location on each machine?
> > >> Can we have the Hadoop installation at different locations on different
> > >> machines in the same cluster?
> > >> If yes, what do we have to take care of in that case?
> > >>
> > >> Thanks,
> > >> Praveenesh
>
>