Merging Namenode Federation feature (HDFS-1052) to trunk
Suresh Srinivas 2011-03-03, 22:41
We have started pushing changes for namenode federation into the feature branch HDFS-1052. The work items are created as subtasks of the jira HDFS-1052 and are based on the design document published in the same jira. By the end of this week, we will complete pushing the changes to the HDFS-1052 branch. Though the changes in these jiras are already committed, please do provide your feedback on either HDFS-1052 or its subtasks. New items that come out of the feedback will be addressed in new jiras.
Current status of the development:
# The testing of this feature is underway. Most of the basic functionality has been tested both for a single namenode cluster (for backward compatibility) and with multiple namenodes.
# All the existing tests and newly added tests pass (same as trunk).
We plan on merging this branch to trunk after a week or two. This will help us continue making future changes on the trunk. I will send an announcement before merging the federation branch into trunk.
Regards, Suresh
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Allen Wittenauer 2011-03-12, 16:43
On Mar 3, 2011, at 2:41 PM, Suresh Srinivas wrote:
> We have started pushing changes for namenode federation in to the feature branch HDFS-1052. The work items are created as subtask of the jira HDFS-1052 and are based on the design document published in the same jira. By the end of this week, we will complete pushing the changes to HDFS-1052 branch. Though the changes in these jiras are already committed, please do provide your feedback on either HDFS-1052 or its subtasks. New items that come out of the feedback will be addressed in new jiras.
> Current status of the development:
> # The testing of this feature is underway. Most of the basic functionality has been tested both for a single namenode cluster (for backward compatibility) and with multiple namenodes.
> # All the existing tests and newly added tests pass (same as trunk).
>
> We plan on merging this branch to trunk after a week or two. This will help us continue make future changes on the trunk. I will send an announcement before merging the federation branch into trunk.
It sounds like merging into trunk is extremely premature. That said, I'm still trying to understand the whys around this.
To me, this series of changes looks like it is going to make running a grid much, much harder for very little benefit. In particular, I don't see the difference between running multiple NN/DN combinations versus running federation, especially with client-side mount tables in play.
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Konstantin Shvachko 2011-03-14, 17:28
Allen is right. This is a huge new feature with 86 jiras already filed, which substantially increases the complexity of the code base. An in-depth motivation and benchmarking will be needed before the community decides on adopting it for support.
Thanks,
--Konstantin
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Dhruba Borthakur 2011-03-14, 17:43
Hi folks,
The design for the federation work has been published, and there is a very well-written design document. It explains the pros and cons of each design point. It would be nice if more people could review this document and provide comments on how to make it better. The implementation is in progress, but that does not mean that the "design-is-cast-in-stone-and-cannot-be-enhanced".
Allen: can you please describe what you mean by "It sounds like merging into trunk is extremely premature"? If we can make all unit tests pass successfully on the branch, then do you think we should merge that branch into the trunk?
Konstantin: I agree that federation introduces new code complexity. But it is a fact that introducing a new heavy-weight feature will add complexity. If you have a different proposal (and implementation) to scale the namenode, please share it with us and we can then evaluate these designs in terms of complexity/feature. If you have questions about certain issues in the design, it would be great if you can ask them now. Hopefully, the folks doing the implementation can then provide you performance numbers to alleviate your concerns.
From the way I look at it, I think the federation feature is a huge positive step in the right direction.
thanks,
dhruba
-- Connect to me at http://www.facebook.com/dhruba
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Konstantin Shvachko 2011-03-15, 01:12
Dhruba, good you are speaking up for federation. I consider it important as it means more support for the feature in the future.
The purpose of my reply was to get this discussion going, as I found Allen's question unanswered for 2 weeks. The concern he has seems legitimate to me. If ops think federation will "make running a grid much much harder", I want to know why and how much harder, because cluster "manageability" is claimed as one of the objectives of federation.
I am certainly familiar with the design, having been a part of it for a while. All my concerns have been articulated and are well known, though not all of them have been addressed.
The way I see it now, Federation introduces
- lots of code complexity to the system
- harder manageability, according to Allen
- potential performance degradation (tbd)
And the main question for those 95% of users, who don't run large clusters or don't want to place all their compute resources in one data center, is what is the advantage in supporting it?
Performance-wise there are 2 main aspects:
- Does federation give me the same cluster performance if I don't federate?
- If I federate, how much more throughput can I get?
Thanks,
--Konstantin
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Travis Crawford 2011-03-15, 05:36
On Mon, Mar 14, 2011 at 6:12 PM, Konstantin Shvachko <[EMAIL PROTECTED]> wrote:
> The way I see it now, Federation introduces
> - lots of code complexity to the system
> - harder manageability, according to Allen
> - potential performance degradation (tbd)
> And the main question for those 95% of users, who don't run large clusters or don't want to place all their compute resources in one data center, is what is the advantage in supporting it?
>
> Performance-wise there 2 main aspects:
> - Does federation give me the same cluster performance if I don't federate?
> - If I federate how much more throughput can I get?
This reminds me of multi-cell GFS (discussed by Quinlan & McKusick at http://bit.ly/einKMn). I used to run some of those clusters, and compared to standard single-master clusters of course they were more complex to manage. However, if you have apps needing that much master capacity & that much shared read+write bandwidth across large pools of storage nodes, it's worth the trouble.
Assuming most people don't use federation, it shouldn't add complexity in the common case, but it opens up some needed capabilities for large sites. Stuff like datanode management would become more challenging in a multi-master environment, but that's where automation comes in. If you don't have teams building tools to manage your datacenter, it's likely you don't need federation either.
I'm currently running a handful of HDFS clusters & my overall reaction to federation is "that's cool, but I probably won't need it for a few years." Seems like the sort of thing the vast majority of sites won't even encounter - you'd just add datanodes to one master & start using it.
--travis
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
suresh srinivas 2011-03-15, 06:19
Thanks for starting off the discussion.
> This is a huge new feature with 86 jiras already filed, which substantially increases the complexity of the code base.
These are 86 jiras filed in a feature branch. We decided to make these changes in smaller increments instead of a jumbo patch. This was done in good faith, as the community did not want a jumbo patch (as seen in several discussions), to make reviewing the patch easy and to record the changes for reference. The main changes have gone in a few jiras. The others mainly fix test failures, add tests, and fix bugs introduced during development. Please review the patch and provide feedback; we will address the concerns.
> Having an in-depth motivation and benchmarking will be needed before the community decides on adopting it for support.
This comes as a surprise, especially from Konstantin :-). The first part of the proposal and the design both cover motivation.
With regard to benchmarking: if you look at the design, there is no big change in the I/O subsystem. Most of the changes are in the organization of storage to introduce block pools, the block pool ID, a thread per namenode in the datanode, and upgrade/rollback. I am not sure what concerns you have with regard to benchmarking. So far our tests show no difference with federation.
As we developed this feature, some significant improvements have been made to the system: fast snapshots (snapshot time down from 1hr 45 mins to 1 min!), fast startup, cleanup of storage, fixing multithreading issues in several places, decommissioning improvements etc.
> The purpose of my reply was to get this discussion going, as I found Allens question unanswered for 2 weeks.
My email was sent on March 3rd. Allen's email was sent on March 12th.
> The concern he has seems legitimate to me. If ops think federation will "make running a grid much much harder" I want to know why and how much harder.
I would like to understand the concerns here. Allen, please add details.
> The way I see it now, Federation introduces
> - lots of code complexity to the system
> - harder manageability, according to Allen
> - potential performance degradation (tbd)
I have addressed these already.
> And the main question for those 95% of users, who don't run large clusters or don't want to place all their compute resources in one data center, is what is the advantage in supporting it?
This is a valid concern. Hence the single namenode configuration that most installations run today will run as is. We put a lot of development and testing effort into ensuring this.
Regards, Suresh
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Konstantin Shvachko 2011-03-16, 22:52
On Mon, Mar 14, 2011 at 11:19 PM, suresh srinivas <[EMAIL PROTECTED]>wrote:
> Thanks for starting off the discussion.
>
> > This is a huge new feature with 86 jiras already filed, which substantially increases the complexity of the code base.
> These are 86 jiras file in a feature branch. We decided to make these changes, in smaller increments, instead of a jumbo patch. This was done in good faith, as community did not want a jumbo patch (as seen in several discussions), to make reviewing of the patch easy and to record the changes for reference.
Thanks for doing it that way.
> > Having an in-depth motivation and benchmarking will be needed before the community decides on adopting it for support.
> This comes as a surprise, especially from Konstantin :-). The first part of the proposal and design both cover motivation.
That is a different motivation. The document talks about why you should use federation. I am asking about the motivation for supporting the code base while not using it. At least this is how I understand Allen's question and some of my colleagues'.
> So far our tests show no difference with federation.
This is exactly what is needed. It would help if you could put some numbers in the jira for reference.
Also, it would be interesting to know whether there is a benefit in splitting the namespace. Can I, e.g., do more getBlockLocations per second? This is one of the aspects of scaling, right?
> As we developed this feature, some significant improvements have been made to the system - fast snapshots (snapshot time down from 1hr 45 mins to 1 min!), fast startup, cleanup of storage, fixing multi threading issues in several places, decommissioning improvements etc.
This is motivation. I am glad I asked.
> > The purpose of my reply was to get this discussion going, as I found Allens question unanswered for 2 weeks.
> My email was sent on March 3rd. Allen's email was sent on March 12th.
Sorry, my bad.
> > The concern he has seems legitimate to me. If ops think federation will "make running a grid much much harder" I want to know why and how much harder.
> I would like to understand the concerns here. Allen please add details.
>
> > The way I see it now, Federation introduces
> > - lots of code complexity to the system
> > - harder manageability, according to Allen
> > - potential performance degradation (tbd)
> I have addressed these already.
>
> > And the main question for those 95% of users, who don't run large clusters or don't want to place all their compute resources in one data center, is what is the advantage in supporting it?
> This is a valid concern. Hence the single namenode configuration that most installations run today, will run as is. We put a lot of development and testing effort to ensure this.
I don't know what you mean by "as is". My experience with this word in real estate tells me it can be anything.
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
suresh srinivas 2011-03-16, 23:54
> That is a different motivation. The document talks about why you should use federation. I am asking about motivation of supporting the code base while not using it. At least this is how understand Allen's question and some of my colleagues'.
Namenode code is not changed at all. Datanode code changes to add the notion of a block pool and a thread per NN. For a single NN, the datanode is equivalent to the current datanode. If you argue that there should not be any code change, I am not sure how features like this can be added to HDFS. There is no change from the user's perspective or in the performance of the system, and no additional complexity over the existing system.
> If you could put some numbers in the jira for the reference.
Will do.
> Also it is interesting to know whether there is a benefit in splitting the namespace. Can I e.g. do more getBlockLocations per second? This is one of the aspects of scaling, right?
I do not understand your question. This feature does not scale getBlockLocations per second for a single NN. When you use many NNs, total requests per second does scale for the entire cluster.
> > This is a valid concern. Hence the single namenode configuration that most installations run today, will run as is. We put a lot of development and testing effort to ensure this.
> I don't know what you mean by "as is". My experience with this word in real estate tells me it can be anything.
I used the word with the following meaning (http://www.merriam-webster.com/dictionary/as%20is):
as is: in the presently existing condition without modification
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
suresh srinivas 2011-03-16, 23:55
> Namenode code is not changed at all.
I want to make sure I qualify this right. The change is not significant, other than adding the notion of a BPID (block pool ID) that the NN uses.
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Sanjay Radia 2011-03-14, 17:57
On Mar 12, 2011, at 8:43 AM, Allen Wittenauer wrote:
> It sounds like merging into trunk is extremely premature. That said, I'm still trying to understand the why's around this.
>
> To me, this series of changes looks like it is going to make running a grid much much harder for very little benefit. In particular, I don't see the difference between running multiple NN/DN combinations verses running federation, especially with client side mount tables in play.
The main difference between independent HDFS clusters and HDFS federation is that in federation the NNs share the DNs and the DNs' storage. There is a very detailed document on the jira that describes this.
If you are running a single NN and you don't need the scaling, then running and managing Hadoop is for all practical purposes unchanged.
sanjay
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Sanjay Radia 2011-03-21, 23:08
On Mar 14, 2011, at 10:57 AM, Sanjay Radia wrote:
> > To me, this series of changes looks like it is going to make running a grid much much harder for very little benefit. In particular, I don't see the difference between running multiple NN/DN combinations verses running federation, especially with client side mount tables in play.
>
> Main difference between independent HDFS clusters and HDFS federation is that in federation one can shares the storage of the DNs and the DNs.
> There is a very detailed document that describes this on the Jira.
>
> If you are running a single NN and you don't need the scaling then running and managing hadoop is for all practical purposes unchanged.
Allen, I am not sure if I explained the difference above. Based on the discussion we had at the HUG, I want to clarify a few things.
In federation the NNs and the DNs are part of a cluster. It is not as if a data node is willing to store blocks for any NN anywhere in the data center. We still expect a data center to have multiple Hadoop clusters, each with a set of data nodes and each cluster with 1 or more NNs. A DN stores blocks for only ONE cluster.
You had asked about how one debugs a corrupt file or corrupt block. In the old world a file's inode contains the block ids of its blocks. There is also a mapping from block id to block location (i.e. which DN). In federated HDFS, each block is identified by a longer block id, called the extended block id = block pool id + block id. A block pool is owned by only ONE NN. Hence if you are trying to locate a block, you map the extended block id to the block location (i.e. DN). This is the same as before, except that the identifier of the block is merely longer.
If you are trying to debug from the point of view of the DN: in federated HDFS, the blocks stored in the DN are segregated in directories by the block pool id. The block pool id can be mapped to a NN since each block pool has only ONE owner. Hence mapping from a block to a particular NN is easy: the first part of the block's longer identifier will tell you which NN owns that block.
sanjay
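As a rough illustration of the identifier scheme described above, here is a minimal Python sketch. It is not HDFS code: the class name, the pool-to-namenode table, the host names, and the directory layout are all hypothetical and are there only to show that the block pool id part of the longer identifier is enough to find the owning NN and the per-pool directory on a DN.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ExtendedBlockId:
    """Extended block id = block pool id + block id (names are illustrative)."""
    block_pool_id: str
    block_id: int


# Hypothetical table: every block pool has exactly ONE owning namenode.
POOL_OWNER = {
    "BP-1": "nn1.example.com",
    "BP-2": "nn2.example.com",
}


def owning_namenode(block: ExtendedBlockId, pool_owner: Mapping[str, str]) -> str:
    """Map a block to its namenode using only the block pool id part."""
    return pool_owner[block.block_pool_id]


def datanode_block_dir(data_root: str, block: ExtendedBlockId) -> str:
    """Blocks on a datanode are segregated by block pool id.

    The path format here is an assumption for illustration, not the real
    HDFS on-disk layout.
    """
    return f"{data_root}/{block.block_pool_id}/blk_{block.block_id}"


if __name__ == "__main__":
    blk = ExtendedBlockId("BP-1", 42)
    print(owning_namenode(blk, POOL_OWNER))
    print(datanode_block_dir("/data/dn", blk))
```

Because a block pool has a single owner, the lookup is a plain dictionary access; nothing about locating a block changes except that the key is longer.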
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Brian Bockelman 2011-03-21, 23:25
On Mar 21, 2011, at 6:08 PM, Sanjay Radia wrote:
> Allen, not sure if I explained the difference above.
> Base on the discussion we had at the Hug, I want to clarify a few things
>
> In federation the NNs and the DNs are part of a cluster. It is not as if a data node is willing to store blocks for any NN anywhere in the data center.
> We still expect a data center to have multiple hadoop clusters each with a set of data nodes and each cluster with 1 or more NNs.
> A DN stores block for only ONE cluster.
A few questions:
- Do we have a clear definition for a cluster?
- With the above definition, is it an error if not all DNs belong to the same set of NNs?
- With the working definition of a cluster, what namespace guarantees are given to clients?
The reason I ask is not because I oppose the idea of federations, but rather because I am curious about the terminology and how it's 'advertised' to the user. I rather like the design; it has similar ideas to an NSF project I've seen (http://www.reddnet.org/).
> You had asked about how one debugs a corrupt file or corrupt block.
> In the old world a file's inode contains the block ids of its blocks. There is also a mapping from block id to block location (ie which DN).
> In the federated hdfs, each block is identified by a longer block id, called the extended block id= blockPool Id + block id.
> A block pool is owned by only ONE NN.
> Hence if you are trying to locate a block then you map the extended block id to the block location (ie DN) - this is the same as before, except that the identifier of the block is merely longer.
>
> If you are trying to debug from the point of view of the DN:
> In federated HDFS, the blocks stored in the DN are segregated in directories by the blockPool Id.
> The block pool id can be mapped to a NN since each Block pool has only ONE owner.
> Hence to map from a block to a particular NN is easy - the first part of the Block's longer identifier will tell you which NN owns that block.
This sounds good.
Brian
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
suresh srinivas 2011-03-24, 09:28
> A few questions:
> - Do we have a clear definition for a cluster?
Before federation, a cluster is defined by the list of datanodes in the include file, bound together by the namespaceID of the namenode that these nodes bind to on first registration with the namenode. In essence, the namespaceID defines the cluster nodes.
In a federated cluster, the namenodes are set up with the same clusterID. The clusterID is established at the datanodes when they first register with a namenode. So nodes with the same clusterID are part of the cluster.
> - With the above definition, is it an error if not all DNs belong to the same set of NNs?
A DN has to belong to the same set of NNs sharing the same clusterID. DNs cannot register with a namenode that has a different clusterID.
> - With the working definition of a cluster, what namespace guarantees are given to clients?
I am not sure what you mean by this.
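The registration rule above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `DataNode` class, the `register` method, and the `ClusterIdMismatch` exception are hypothetical names, not the HDFS implementation. It shows the two behaviours described: the clusterID is fixed on first registration, and a later registration with a namenode carrying a different clusterID is refused.

```python
from typing import Optional


class ClusterIdMismatch(Exception):
    """Raised when a datanode is asked to join a namenode of another cluster."""


class DataNode:
    """Toy datanode that remembers the clusterID from its first registration."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cluster_id: Optional[str] = None  # unset until first registration

    def register(self, namenode_cluster_id: str) -> None:
        if self.cluster_id is None:
            # First registration establishes the clusterID on the datanode.
            self.cluster_id = namenode_cluster_id
        elif self.cluster_id != namenode_cluster_id:
            # A namenode with a different clusterID belongs to another cluster.
            raise ClusterIdMismatch(
                f"{self.name} belongs to {self.cluster_id}, "
                f"cannot register with {namenode_cluster_id}"
            )


if __name__ == "__main__":
    dn = DataNode("dn1")
    dn.register("CID-A")      # first NN: establishes the clusterID
    dn.register("CID-A")      # second NN in the same cluster: accepted
    try:
        dn.register("CID-B")  # NN from a different cluster: refused
    except ClusterIdMismatch as err:
        print(err)
```

Under this model, "the cluster" is simply the set of nodes that hold the same clusterID, which matches the definition given in the reply.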
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Allen Wittenauer 2011-03-22, 20:11
On Mar 21, 2011, at 4:08 PM, Sanjay Radia wrote:
> Allen, not sure if I explained the difference above.
> Base on the discussion we had at the Hug, I want to clarify a few things
Thanks for taking the time at HUG. (I've since figured out that I lost your messages as part of my email list transition.)
> A DN stores block for only ONE cluster.
But this does make things easier. I'm still fairly confident that it adds too much complexity for little gain, though. So put this in the 'agree to disagree' column. It would still be nice if you guys could lay off the camelCase options though. Admins hate the shift key.
BTW, Robert C. asked what I thought you guys should have been working on instead of Federation. I told him (and you) high availability of the namenode (which I still believe is necessary for HDFS in more and more cases), but I've had more time to think about it. So expect my list (which I'll post here) soon. :p
Re: Merging Namenode Federation feature (HDFS-1052) to trunk
suresh srinivas 2011-03-24, 09:34
> But this does make things easier. Although I'm still fairly confident that it adds too much complexity for little gain though.
Allen, can you please add details on what complexity you are talking about here? (I have already asked this question many times.)
From a code perspective it is not adding complexity, as I have explained before.
You could choose to run the cluster with a single namenode and not see much difference. But in our case federation does solve the problems of complicated setup of multiple clusters, balancing storage across the clusters, the lack of a single view, and duplication of data.
> So put this in the 'agree to disagree' column. It would still be nice if you guys could lay off the camelCase options though. Admins hate the shift key.
I did reply to your comment saying the options are case insensitive.
> BTW, Robert C. asked what I thought you guys should have been working on instead of Federation. I told him (and you) high availability of the namenode (which I still believe is necessary for HDFS in more and more cases), but I've had more time to think about it. So expect my list (which I'll post here) soon. :p
Federation is solving an important problem for us. We are looking at HA, as you might have seen in some of the jira activities.