Upgrading namenode/secondary node hardware
MilleBii 2011-06-14, 21:01
I want/need to upgrade my namenode/secondary namenode hardware. The machine also acts as one of the datanodes.
I could not find any how-to guides. So what is the process for switching from the old hardware to the new?
1. For HDFS data: is it just a matter of copying all the HDFS data from the old server to the new server?
2. What about the datanode decommissioning procedure: is it necessary in that case?
3. For MapRed: do I need to change the master in the cluster configuration files?
Any help or pointers welcome!
-- -MilleBii-
Re: Upgrading namenode/secondary node hardware
Steve Loughran 2011-06-15, 10:18
On 14/06/11 22:01, MilleBii wrote:
> I want/need to upgrade my namenode/secondary node hardware. Actually also
> acts as one of the datanodes.
>
> Could not find any how-to guides.
> So what is the process to switch from one hardware to the next.
>
> 1. For HDFS data : is it just a matter of copying all the hdfs data from old
> server to new server.
Yes, put it in the same place on your HA storage and you may not even need to reconfigure it. If you didn't shut down the filesystem cleanly, you'll need to replay the edit logs.
> 2. what about the decommissioning procedure of data node, is it necessary in
> that case ?
You shouldn't need to. This is no different from handling failover of a namenode, which you ought to try from time to time anyway, with two common tactics
-have ready-to-go replacement servers with the same hostname/IP and shared storage
-have ready-to-go replacement servers with different hostnames, then with your cluster management tools bounce the workers into a new configuration.
> 3.For MapRed: need to change the master in cluster configuration files
I'd give the new boxes the same hostnames and IP addresses as before, and nothing else will notice. And I recommend having good cluster management tooling anyway, of course.
Re: Upgrading namenode/secondary node hardware
MilleBii 2011-06-15, 14:54
Thx.
#1 I don't understand the "edit logs" remark.
#2 Good & nice.
#3 My provider will give me a server with a different IP, so I will have to change /etc/hosts on every node to point to the new master. But indeed I don't need to change the masters/slaves files.
-- -MilleBii-
Re: Upgrading namenode/secondary node hardware
MilleBii 2011-06-15, 14:55
Do you have a recommendation for good cluster management tooling?
-- -MilleBii-
Re: Upgrading namenode/secondary node hardware
Steve Loughran 2011-06-16, 11:49
On 15/06/11 15:54, MilleBii wrote:
> Thx.
>
> #1 don't understand the "edit logs" remark.
Well, that's something you need to work on, as it's the key to keeping your cluster working. The edit log is the journal of changes made to a namenode, which gets streamed to HDD and your secondary Namenode. After a NN restart, it has to replay all changes since the last checkpoint to get its directory structure up to date. Lose the edit log and you may as well reformat the disks.
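To make the checkpoint-plus-replay idea concrete, here is a minimal toy model in Python. It is only a sketch of the concept: the `ToyNamespace` class, the `restart` function and the two operation names are invented for illustration and are not Hadoop's actual data structures or on-disk formats.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNamespace:
    """Toy stand-in for a namenode's in-memory directory tree (hypothetical)."""
    paths: set = field(default_factory=set)

    def apply(self, op):
        # Each journal entry is a (kind, path) pair in this toy model.
        kind, path = op
        if kind == "create":
            self.paths.add(path)
        elif kind == "delete":
            self.paths.discard(path)
        else:
            raise ValueError(f"unknown op: {kind}")


def restart(checkpoint, edit_log):
    """Rebuild state: load the last checkpoint, then replay every logged edit in order."""
    ns = ToyNamespace(set(checkpoint))
    for op in edit_log:
        ns.apply(op)
    return ns


checkpoint = {"/user/a", "/user/b"}
edit_log = [("create", "/user/c"), ("delete", "/user/a")]

recovered = restart(checkpoint, edit_log)  # checkpoint plus journal
stale = restart(checkpoint, [])            # journal lost: changes since the checkpoint are gone

print(sorted(recovered.paths))  # ['/user/b', '/user/c']
print(sorted(stale.paths))      # ['/user/a', '/user/b']
```

Without the journal, the restarted state silently falls back to the checkpoint, which is the toy version of why losing the edit log is so costly.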
Re: Upgrading namenode/secondary node hardware
MilleBii 2011-06-16, 13:19
But if my filesystem is up & running fine... do I have to worry at all, or will the copy (ftp transfer) of HDFS be enough?
-- -MilleBii-
Re: Upgrading namenode/secondary node hardware
Steve Loughran 2011-06-17, 09:30
On 16/06/11 14:19, MilleBii wrote:
> But if my Filesystem is up& running fine... do I have to worry at all or
> will the copy (ftp transfer) of hdfs will be enough.
I'm not going to make any predictions there as if/when things go wrong
-you do need to shut down the FS before the move
-you ought to get the edit logs replayed before the move
-you may want to try experimenting with copying the namenode data and bringing up the namenode (without any datanodes connected, so it comes up in safe mode), to make sure everything works.
I'd also worry that if you aren't familiar with the edit log, you may need to spend some time learning the subtle details of namenode journalling, replaying, backup and restoration, and what the secondary namenode does. It's easy to bring up a cluster and get overconfident that it works, right up to the moment it stops working. Experiment with your cluster's and team's failure handling before you really need it.
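One cheap check when experimenting with a copy of the namenode data is to confirm that the copy is byte-for-byte identical to the original before starting anything on it. The following is a minimal Python sketch, assuming both directory trees are reachable from one machine (for example over a mounted share); the function names are made up for this example and nothing here is part of Hadoop.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of one file, read in chunks so large image files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def digest_tree(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def copy_mismatches(old_root, new_root):
    """Return the relative paths that are missing on one side or differ in content."""
    old, new = digest_tree(old_root), digest_tree(new_root)
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Calling `copy_mismatches(old_dir, new_dir)` on the old and new namenode storage directories should return an empty list; anything else means the transfer needs repeating. It only proves the copy is faithful, not that the data was in a clean state when copied, so the filesystem still has to be shut down first.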
Re: Upgrading namenode/secondary node hardware
MilleBii 2011-06-17, 13:48
I see it is not so obvious and potentially dangerous so I will be learning & experimenting first. Thx for the tip.
-- -MilleBii-
Re: Upgrading namenode/secondary node hardware
Allen Wittenauer 2011-06-20, 06:25
On Jun 15, 2011, at 3:18 AM, Steve Loughran wrote:
> yes, put it in the same place on your HA storage and you may not even need to reconfigure it. If you didn't shut down the filesystem cleanly, you'll need to replay the edit logs.
As a sidenote...
Lots of weird incompatibilities have snuck into the code with the editslog between versions. You REALLY REALLY REALLY don't want to let editslog get processed by a different version.
Shut down the NN, DNs, etc. Then start the NN up to let it process the editslog. Then shut down the NN again with a clean editslog and do the upgrade.
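The sequence above can be written down as a dry-run checklist. This Python sketch only prints the steps by default. The script names are the ones shipped in the `bin/` directory of Hadoop tarballs of that era and are an assumption about your install, as is the `/opt/hadoop` path; check both before running anything for real.

```python
import subprocess

# Assumed script names from Hadoop 0.20-era tarballs; verify them against your install.
CLEAN_SHUTDOWN_STEPS = [
    ("stop all daemons (NN, DNs, etc.)",
     ["bin/stop-all.sh"]),
    ("start only the namenode so it processes the editslog",
     ["bin/hadoop-daemon.sh", "start", "namenode"]),
    ("check the namenode answers (it stays in safe mode with no datanodes)",
     ["bin/hadoop", "dfsadmin", "-safemode", "get"]),
    ("stop the namenode again, now with a clean editslog",
     ["bin/hadoop-daemon.sh", "stop", "namenode"]),
]


def run_steps(steps, hadoop_home, dry_run=True):
    """Print each step in order; execute it from hadoop_home only when dry_run is False."""
    planned = []
    for description, command in steps:
        print(f"# {description}")
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(command, cwd=hadoop_home, check=True)
        planned.append(command)
    return planned


# Hypothetical install path; dry run only, nothing is executed.
run_steps(CLEAN_SHUTDOWN_STEPS, hadoop_home="/opt/hadoop")
```

Only after the last step completes would the hardware move or version upgrade begin, so that the new machine never has to process an editslog written by the old one.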