HDFS, mail # dev - Question regarding access to different hadoop 2.0 cluster


Re: Question regarding access to different hadoop 2.0 cluster
Todd Lipcon 2013-11-06, 19:40
We've discussed a few times adding a FailoverProxyProvider which would use
DNS records for this. For example, you'd add an SRV record (or multiple A
records) for the logical name, pointing to the physical hosts backing the
cluster. I think it would help reduce client-side configuration pretty
neatly, though it has the disadvantage that your DNS admins need to get in
the loop.
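
As a rough sketch of the idea (the `_hdfs._tcp` service label, hostnames,
and addresses here are all made up, not an established convention), the DNS
side might look like:

```
; Hypothetical zone-file entries for a logical nameservice "cluster2".
; SRV format: _service._proto.name  IN SRV priority weight port target
_hdfs._tcp.cluster2.example.com.  IN SRV 0 0 8020 nn1.cluster2.example.com.
_hdfs._tcp.cluster2.example.com.  IN SRV 0 0 8020 nn2.cluster2.example.com.

; ...or simply multiple A records behind one logical name:
cluster2-nn.example.com.          IN A   10.0.0.11
cluster2-nn.example.com.          IN A   10.0.0.12
```

Such a proxy provider would resolve these records at connect time and try
each target in turn, so the per-cluster namenode list would no longer need
to live in every client's hdfs-site.xml.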

-Todd
On Wed, Nov 6, 2013 at 7:36 AM, Bobby Evans <[EMAIL PROTECTED]> wrote:

> Suresh,
>
> You are correct, I did not explain myself very well. If one of the
> namenodes has a hardware failure, then in order to avoid updating the
> configs for every single service that talks to HDFS, you have to make sure
> the replacement box appears to the network to be exactly the same as the
> original.  This is not impossible, as you mentioned.
>
> The more common case where this is problematic is upgrading clusters from
> non-HA to HA, or adding in new HA clusters, because there is no existing
> IP address/config to be copied.  Every time this happens, all existing
> services must have new configs pushed to be able to talk to the
> new/updated HDFS. This includes Gateways, RMs, Compute Nodes, Oozie
> Servers, etc.
>
> Again, this is not that big of a deal for a small setup, but for a large
> setup it can be painful.
>
> --Bobby
>
> On 11/5/13 4:57 PM, "Suresh Srinivas" <[EMAIL PROTECTED]> wrote:
>
> >On Tue, Nov 5, 2013 at 6:57 AM, Bobby Evans <[EMAIL PROTECTED]> wrote:
> >
> >> But that does present a problem if you have to change the DNS address of
> >> one of the HA namenodes.
> >
> >
> >Not sure what you mean by this? Do you mean the hostname of one of the
> >namenodes changes? If so, why is this not a problem for a single-namenode
> >deployment? How do applications addressing a namenode in a different
> >cluster handle the change?
> >
> >
> >> It forces you to update the config on all other clusters that want to
> >> talk to it.  If you only have a few clusters, that is probably not a
> >> big deal, but it can be problematic if you have many different clusters
> >> that talk to each other.
> >>
> >> --Bobby
> >>
> >> On 11/4/13 4:15 PM, "lohit" <[EMAIL PROTECTED]> wrote:
> >>
> >> >Thanks Suresh!
> >> >
> >> >
> >> >2013/11/4 Suresh Srinivas <[EMAIL PROTECTED]>
> >> >
> >> >> Lohit,
> >> >>
> >> >> The option you have enumerated at the end is the current way to set
> >> >> up a multi-cluster environment. That is, all the client-side
> >> >> configurations will include the following:
> >> >> - Logical service names (either for federation or HA)
> >> >> - The corresponding physical namenode address information
> >> >>
> >> >> For simpler management, one could use an XML include to pull in an
> >> >> XML document that defines all the namespaces and namenodes.
> >> >>
> >> >> Regards,
> >> >> Suresh
> >> >>
> >> >>
> >> >> On Mon, Nov 4, 2013 at 2:02 PM, lohit <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >> > Hello Devs,
> >> >> >
> >> >> > With Hadoop 1.0, when there was a single namespace, one could
> >> >> > access any HDFS cluster using any other Hadoop config. Something
> >> >> > like this:
> >> >> >
> >> >> > hadoop --config /path/to/hadoop-cluster1 hdfs://hadoop-cluster2:8020/
> >> >> >
> >> >> > Since the NameNode host and port were passed directly as part of
> >> >> > the URI, one could talk to different clusters, as long as the HDFS
> >> >> > client version matched, without needing access to cluster-specific
> >> >> > configuration.
> >> >> >
> >> >> > With Hadoop 2.0 in HA mode, we only specify a logical name for the
> >> >> > namenode and rely on hdfs-site.xml to resolve that logical name to
> >> >> > the two underlying namenode hosts.
> >> >> >
> >> >> > So, you cannot do something like
> >> >> >
> >> >> > hadoop --config /path/to/hadoop-cluster1 hdfs://hadoop-cluster2-logicalname/
> >> >> >
> >> >> > since /path/to/hadoop-cluster1/hdfs-site.xml does not have
> >> >> > information about hadoop-cluster2-logicalname's namenodes.
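
For reference, the setup Suresh describes amounts to a client-side
hdfs-site.xml fragment along these lines (the nameservice label and
hostnames are taken from lohit's example and are illustrative):

```xml
<!-- Illustrative client-side config: makes the logical name
     "hadoop-cluster2-logicalname" resolvable from any cluster's config. -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-cluster2-logicalname</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hadoop-cluster2-logicalname</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-cluster2-logicalname.nn1</name>
    <value>nn1.cluster2.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-cluster2-logicalname.nn2</name>
    <value>nn2.cluster2.example.com:8020</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hadoop-cluster2-logicalname</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
```

Since Hadoop's configuration loader supports XInclude, a fragment like this
for each remote cluster could live in one shared file that every cluster's
hdfs-site.xml pulls in, rather than being copied by hand.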
Todd Lipcon
Software Engineer, Cloudera