ESGLinux 2012-12-20, 16:30
Re: Question about HA and Federation
Harsh J 2012-12-20, 17:12
Hi ESGLinux,

Federation and HA are two distinct features that share some common properties but nothing more. You can turn on HA for any selected namespace; not every namespace needs to have HA.

Perhaps an example will clear it up for you.

I have a local instance that is configured to run several namespaces: ns1, ns2, and ns3 (federated namespaces). The namespace ns1 hosts my HBase tables and is critical to me, so I have also turned on HA for this namespace alone. The other two namespaces, ns2 and ns3, are used only for regular query jobs, so it's not yet very important for me to have HA on them. So I run them without HA.

Thus I have 4 NameNode processes in my cluster in all, given my design above: (2 NNs under ns1, in HA mode) + (1 NN of ns2) + (1 NN of ns3).

On Thu, Dec 20, 2012 at 10:00 PM, ESGLinux <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I'm going to test a Hadoop cluster and I have a doubt about HA and
> Federation.
>
> With Federation I have a NameNode per namespace, and with HA I have an
> Active NameNode and a Standby NameNode.
>
> So, as I have several namespaces, do I need an Active NameNode and a
> Standby NameNode per namespace?
>
> I have read this documentation but it's not clear to me :-(
>
> https://ccp.cloudera.com/display/CDH4DOC/Introduction+to+Hadoop+High+Availability
> http://hadoop.apache.org/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/Federation.html
>
> Thanks in advance
>
> ESGLinux

--
Harsh J
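As a rough illustration of the layout described in this reply (ns1 in HA mode, ns2 and ns3 without HA), the following sketch builds the kind of key/value pairs one might place in hdfs-site.xml. The property names follow the Apache Hadoop 2.x HDFS documentation as far as I know; older releases may use different names. The host names, the NameNode IDs nn1/nn2, and port 8020 are hypothetical, and the helper `build_hdfs_site` is invented for this example only.

```python
def build_hdfs_site(namespaces, rpc_port=8020):
    """Return hdfs-site.xml style key/value pairs for federated namespaces.

    `namespaces` maps a nameservice ID to a list of (namenode_id, host)
    pairs. Two entries are treated as an HA pair, one entry as non-HA.
    Hosts, IDs, and the port are illustrative assumptions.
    """
    props = {"dfs.nameservices": ",".join(namespaces)}
    for ns, namenodes in namespaces.items():
        if len(namenodes) > 1:
            # HA namespace: list the NameNode IDs, then one address per ID.
            props[f"dfs.ha.namenodes.{ns}"] = ",".join(nn for nn, _ in namenodes)
            for nn, host in namenodes:
                props[f"dfs.namenode.rpc-address.{ns}.{nn}"] = f"{host}:{rpc_port}"
        else:
            # Non-HA namespace: a single address keyed by the nameservice ID.
            _, host = namenodes[0]
            props[f"dfs.namenode.rpc-address.{ns}"] = f"{host}:{rpc_port}"
    return props


layout = {
    "ns1": [("nn1", "host-a.example.com"), ("nn2", "host-b.example.com")],
    "ns2": [("nn1", "host-c.example.com")],
    "ns3": [("nn1", "host-d.example.com")],
}
props = build_hdfs_site(layout)
for key in sorted(props):
    print(f"{key} = {props[key]}")
```

The four `dfs.namenode.rpc-address.*` entries correspond to the four serving NameNode processes counted in the reply. A real HA setup needs further settings (shared edits storage, failover proxy provider, fencing) that are left out here.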
Re: Question about HA and Federation
ESGLinux 2012-12-20, 17:39
Hi Harsh,

First, thank you very much for your answer.

Following your example, you have:

1 Active NameNode + 1 Passive NameNode (it does the work of the old Secondary NameNode) for the NS1 namespace (these are 2 different machines)
1 NameNode for NS2
1 NameNode for NS3

But what about the Secondary NameNodes for NS2 and NS3? Or don't I need them? Perhaps I'm mixing concepts...

Thanks again,

Greetings,

ESGLinux
Re: Question about HA and Federation
Harsh J 2012-12-20, 17:48
Hi,

To put it simply: if you use a NameNode, you need a SecondaryNameNode. In HA mode, a StandbyNameNode acts as a SecondaryNameNode (so you don't need to run an extra one).

Either way, you definitely need the checkpoint operation to be happening and to be monitored.

--
Harsh J
Re: Question about HA and Federation
ESGLinux 2012-12-20, 17:55
Hi again,

So, finally, the number of nodes is this:

1 Active NameNode + 1 Passive NameNode (it does the work of the old Secondary NameNode) for the NS1 namespace (these are 2 different machines)
1 NameNode for NS2 + 1 Secondary NameNode
1 NameNode for NS3 + 1 Secondary NameNode

Can we say that we need 2 nodes per namespace? Is that true?

Thanks,

ESGLinux
Re: Question about HA and Federation
Harsh J 2012-12-20, 17:57
Yes, I think it's safe to say that. Sorry that I missed out the SNNs in my first response (I counted only the regular serving NameNodes) :)

--
Harsh J
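The tally agreed on in this exchange can be checked with a few lines of Python. This is only a sketch of the counting rule stated in the thread (an HA namespace runs an active and a standby NameNode, a non-HA namespace runs a NameNode plus a SecondaryNameNode); the function name `namenode_roles` and the role labels are made up for the illustration.

```python
def namenode_roles(ha_enabled):
    """Roles needed for one namespace, following the rule from the thread."""
    if ha_enabled:
        # The standby NameNode also performs checkpointing.
        return ["active NameNode", "standby NameNode"]
    # Without HA, a SecondaryNameNode handles checkpoints.
    return ["NameNode", "SecondaryNameNode"]


# The example cluster: HA on ns1 only.
cluster = {"ns1": True, "ns2": False, "ns3": False}
roles = {ns: namenode_roles(ha) for ns, ha in cluster.items()}

# Serving NameNodes exclude the SecondaryNameNodes.
serving = sum(
    1 for per_ns in roles.values() for role in per_ns
    if role != "SecondaryNameNode"
)
total = sum(len(per_ns) for per_ns in roles.values())

print(f"serving NameNodes: {serving}, all processes: {total}")
```

For this cluster the count is 4 serving NameNodes (the figure in the first reply) and 6 processes once the two SecondaryNameNodes are included, which is 2 per namespace.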
Re: Question about HA and Federation
ESGLinux 2012-12-20, 18:03
Thank you very much,

Your answer has clarified these concepts for me a great deal.

I didn't understand how I could mix HA and Federation, or how many nodes I need...

Kind regards,

ESGLinux
Re: Question about HA and Federation
Harsh J 2012-12-20, 18:26
Btw, you can co-locate NameNodes (ones that serve different namespaces) on the same machine if you need to. The configs easily allow this via the rpc/http port specifiers.

--
Harsh J
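Co-locating NameNodes on one machine only works if each one gets its own RPC and HTTP port. The sketch below checks a planned layout for clashes before any config is written. It is a minimal illustration: the host name, the port numbers, and the helper `port_conflicts` are all hypothetical and not part of Hadoop.

```python
from collections import defaultdict


def port_conflicts(namenodes):
    """Return (host, port) pairs that are claimed more than once.

    `namenodes` is a list of dicts with the keys ns, host, rpc_port and
    http_port. All values used below are illustrative assumptions.
    """
    claimed = defaultdict(list)
    for nn in namenodes:
        for kind in ("rpc_port", "http_port"):
            claimed[(nn["host"], nn[kind])].append((nn["ns"], kind))
    return {key: users for key, users in claimed.items() if len(users) > 1}


# ns2 and ns3 share one machine and use distinct ports, so no clash.
plan = [
    {"ns": "ns2", "host": "host-c.example.com", "rpc_port": 8020, "http_port": 50070},
    {"ns": "ns3", "host": "host-c.example.com", "rpc_port": 8021, "http_port": 50071},
]
conflicts = port_conflicts(plan)
print("conflicts:", conflicts)

# Reusing the same RPC port on the same host is reported as a conflict.
bad_plan = [dict(plan[0]), dict(plan[1], rpc_port=8020)]
bad_conflicts = port_conflicts(bad_plan)
print("conflicts:", bad_conflicts)
```

In an actual deployment the chosen ports would go into the per-namespace address properties in hdfs-site.xml.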