-
[DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Chris Douglas 2010-08-06, 21:02

Hadoop developers tend to specialize in either HDFS or MapReduce, but
given that:

0) Granting karma to Common is routine for a committer in either
   space; there are no Common-only committers
1) The majority of committers have been grandfathered into committer
   roles in all three projects
2) Many patches to Common require corresponding commits to both HDFS
   and MapReduce
3) Review-then-commit is usually sufficient notice for interested
   parties to comment
4) There have been few problems with committers pushing in patches
   without consulting someone more directly involved
5) Everyone on the PMC gets commit rights to all subprojects, anyway

Perhaps it would make sense to give up on separate committer roles
until the projects are separate TLPs.

On the other hand:

0) Nobody has been independently added to both HDFS and MapReduce
   since the projects were separated
1) It could exacerbate the focus on MapReduce in HDFS, at the expense
   of other projects (like HBase).
2) HDFS and MapReduce are mostly independent communities and
   codebases; expertise in one does not imply fluency in the other
3) Granting veto power across projects can lead to deadlock despite
   consensus within that community
4) TLP status for either project may require untangling HDFS/MR roles
   that could be distinguished now

Personally, I'm in favor of combining the roles. I trust all six of
the committers made since the project split no less than those made
earlier. Further, version control is sufficient for recovering from
most foreseeable issues. I have some concerns about "harmless"
commits pushed through without an audit by the subproject's
maintainers (a few in recent memory caused downtime in Y! clusters),
but combining the roles seems like a worthwhile experiment.

Thoughts? -C
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Doug Cutting 2010-08-06, 21:47

On 08/06/2010 02:02 PM, Chris Douglas wrote:
> Thoughts?

To my thinking, a single set of committers should manage a product. A
product is something that's released. Currently, we still branch and
release Common, HDFS and MapReduce together, so I regard them as a
single product and hence believe they should have a single set of
committers.

If/when we start to release them separately then we can consider
splitting committer lists. Even then, we don't have to split them,
since a single set of committers can manage multiple products. In
fact, I don't see a strong case for splitting committer lists until
these become separate TLPs.

(The other Hadoop subprojects with separate committer lists ought to
become TLPs, but that's a separate discussion.)

Doug
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Arun C Murthy 2010-08-08, 19:21

On Aug 6, 2010, at 2:47 PM, Doug Cutting wrote:
> On 08/06/2010 02:02 PM, Chris Douglas wrote:
>> Thoughts?
>
> To my thinking, a single set of committers should manage a product. A
> product is something that's released. Currently, we still branch and
> release Common, HDFS and MapReduce together, so I regard them as a
> single product and hence believe they should have a single set of
> committers.

This of course begs a larger question - should we just merge Common,
HDFS & Map-Reduce together and be done with?

Arun
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Devaraj Das 2010-08-08, 19:50

On Aug 8, 2010, at 12:23 PM, "Arun C Murthy" <[EMAIL PROTECTED]> wrote:
> On Aug 6, 2010, at 2:47 PM, Doug Cutting wrote:
>> On 08/06/2010 02:02 PM, Chris Douglas wrote:
>>> Thoughts?
>>
>> To my thinking, a single set of committers should manage a product.
>> A product is something that's released. Currently, we still branch
>> and release Common, HDFS and MapReduce together, so I regard them as
>> a single product and hence believe they should have a single set of
>> committers.
>
> This of course begs a larger question - should we just merge Common,
> HDFS & Map-Reduce together and be done with?

Good point Arun. I am in support of this.

> Arun
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Hemanth Yamijala 2010-08-09, 04:42

> On Aug 6, 2010, at 2:47 PM, Doug Cutting wrote:
>> On 08/06/2010 02:02 PM, Chris Douglas wrote:
>>> Thoughts?
>>
>> To my thinking, a single set of committers should manage a product.
>> A product is something that's released. Currently, we still branch
>> and release Common, HDFS and MapReduce together, so I regard them as
>> a single product and hence believe they should have a single set of
>> committers.
>
> This of course begs a larger question - should we just merge Common,
> HDFS & Map-Reduce together and be done with?

I actually think that would simplify matters *smile*

Thanks
hemanth
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Steve Loughran 2010-08-09, 11:07

On 09/08/10 05:42, Hemanth Yamijala wrote:
>> On Aug 6, 2010, at 2:47 PM, Doug Cutting wrote:
>>> On 08/06/2010 02:02 PM, Chris Douglas wrote:
>>>> Thoughts?
>>>
>>> To my thinking, a single set of committers should manage a product.
>>> A product is something that's released. Currently, we still branch
>>> and release Common, HDFS and MapReduce together, so I regard them
>>> as a single product and hence believe they should have a single set
>>> of committers.
>>
>> This of course begs a larger question - should we just merge Common,
>> HDFS & Map-Reduce together and be done with?
>
> I actually think that would simplify matters *smile*

I seem to recall the original goal of the split was to make it easier
for people to stay current with the dev of their particular bit, but
I'm not sure that goal has been met:

-more dev mailing lists, all of which I'm behind on
-more user lists, messages get sent to more of them or it's not clear
 where
-it's very hard to push out manageability changes across everything
-no clear push for a desynchronised release process
-testing of everything is very convoluted

common, hdfs and mapreduce are very tightly coupled, in code and in
releases. I don't see the split as having helped, it just seems to
have slowed the rate of release, and that's been bad for everyone.

-steve
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Owen O'Malley 2010-08-09, 16:23

On Aug 8, 2010, at 12:21 PM, Arun C Murthy wrote:
> This of course begs a larger question - should we just merge Common,
> HDFS & Map-Reduce together and be done with?

*Sigh* I wish we'd just split the mailing lists, source code, and
artifacts (jars, documentation) and left it in a single project.
Basically, we should have set it up as:

http://svn.apache.org/repos/asf/hadoop/trunk/{common,hdfs,mapreduce}

It would certainly make it easier to deal with cross-project patches.

But, that said, I'm not wild about going through another cutoff where
we lose all of our history in git. (Subversion tracks through the
split but the svn to git gateway views them as new files.) I really
wish we could just move to git for commits as well as for reads.

I'm also concerned about spinning for the sake of spinning. Doing more
major code movement will cause more instability in trunk that I'm not
sure is a good idea without clear goals and roadmaps.

-- Owen
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Doug Cutting 2010-08-09, 16:26

On 08/08/2010 12:21 PM, Arun C Murthy wrote:
> This of course begs a larger question - should we just merge Common,
> HDFS & Map-Reduce together and be done with?

I think there's still a reasonable long-term goal to split MapReduce
from HDFS, so that they can release separately and are maintained by
separate teams. So I believe a strong division of these code trees and
release artifacts should remain.

I'd like to get rid of Common. It could either be merged into HDFS or
gradually whittled away to nothing. I'd prefer the latter. If we move
to different RPC and serialization systems (e.g., Avro) then Common's
io and ipc packages might be removed. Configuration might be
replaced/merged with Jakarta Commons Configuration
(http://commons.apache.org/configuration/). Similarly, the metrics and
fs packages might be moved to Jakarta Commons. Such changes might be
hard to do back-compatibly, however.

I don't see that merging the Jira databases or mailing lists for HDFS
and MapReduce offers big advantages. The redundant, coordinated Jiras
tend to be between Common and the others, no?

Doug
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Konstantin Shvachko 2010-08-10, 00:40

On 8/9/2010 9:26 AM, Doug Cutting wrote:
> On 08/08/2010 12:21 PM, Arun C Murthy wrote:
>> This of course begs a larger question - should we just merge Common,
>> HDFS & Map-Reduce together and be done with?
>
> I think there's still a reasonable long-term goal to split MapReduce
> from HDFS, so that they can release separately and are maintained by
> separate teams. So I believe a strong division of these code trees
> and release artifacts should remain.

I think that eventually when we solve the backward compatibility
issue, with Avro, we can make MR and HDFS different products. Even
though right now they are not, it would be a step back to merge them
back. As we look forward to making two different products, shouldn't
there be two separate sets of committers? Of course, as long as they
are not really separated it makes sense to have a single committer
pool.

> I'd like to get rid of Common. It could either be merged into HDFS or
> gradually whittled away to nothing. I'd prefer the latter. If we move
> to different RPC and serialization systems (e.g., Avro) then Common's
> io and ipc packages might be removed. Configuration might be
> replaced/merged with Jakarta Commons Configuration
> (http://commons.apache.org/configuration/). Similarly, the metrics
> and fs packages might be moved to Jakarta Commons. Such changes might
> be hard to do back-compatibly, however.

Great idea. We should do this.

> I don't see that merging the Jira databases or mailing lists for HDFS
> and MapReduce offers big advantages. The redundant, coordinated Jiras
> tend to be between Common and the others, no?

Yes.

--Konstantin
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Jeff Hammerbacher 2010-08-10, 16:44

> > I'd like to get rid of Common. It could either be merged into HDFS
> > or gradually whittled away to nothing. I'd prefer the latter. If we
> > move to different RPC and serialization systems (e.g., Avro) then
> > Common's io and ipc packages might be removed. Configuration might
> > be replaced/merged with Jakarta Commons Configuration
> > (http://commons.apache.org/configuration/). Similarly, the metrics
> > and fs packages might be moved to Jakarta Commons. Such changes
> > might be hard to do back-compatibly, however.
>
> Great idea. We should do this.

Cool, created https://issues.apache.org/jira/browse/HADOOP-6909 to track.
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Steve Loughran 2010-08-16, 09:56

On 10/08/10 17:44, Jeff Hammerbacher wrote:
>>> I'd like to get rid of Common. It could either be merged into HDFS
>>> or gradually whittled away to nothing. I'd prefer the latter. If we
>>> move to different RPC and serialization systems (e.g., Avro) then
>>> Common's io and ipc packages might be removed. Configuration might
>>> be replaced/merged with Jakarta Commons Configuration
>>> (http://commons.apache.org/configuration/). Similarly, the metrics
>>> and fs packages might be moved to Jakarta Commons. Such changes
>>> might be hard to do back-compatibly, however.
>>
>> Great idea. We should do this.
>
> Cool, created https://issues.apache.org/jira/browse/HADOOP-6909 to
> track.

I'm actually against this. You push all your stuff upstream -your
management foundations- and you are dependent on the other stuff's
release schedules, and you don't have one single place to do your own
layers, your own indirection.

This doesn't imply anything against commons-config, just that there is
benefit from having some core stuff, and it's in management that it
really wins, and in shared utility stuff like in-VM httpd servers.
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Arun C Murthy 2010-08-16, 14:27

On Aug 16, 2010, at 2:56 AM, Steve Loughran wrote:
> On 10/08/10 17:44, Jeff Hammerbacher wrote:
>> Cool, created https://issues.apache.org/jira/browse/HADOOP-6909 to
>> track.
>
> I'm actually against this. You push all your stuff upstream -your
> management foundations- and you are dependent on the other stuff
> release schedules, and you don't have one single place to do your own
> layers, your own indirection.
>
> This doesn't imply anything against commons-config, just that there
> is benefit from having some core stuff, and it's in management that
> it really wins, and in shared utility stuff like in-VM httpd servers

Agreed. Please comment on HADOOP-6909/HADOOP-6910, let us continue the
discussion on jira.

There are several other reasons this is infeasible including features
such as support for 'final' parameters, lack of serialization etc.

Arun
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Eli Collins 2010-08-10, 17:24

On Mon, Aug 9, 2010 at 9:26 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> On 08/08/2010 12:21 PM, Arun C Murthy wrote:
>> This of course begs a larger question - should we just merge Common,
>> HDFS & Map-Reduce together and be done with?
>
> I think there's still a reasonable long-term goal to split MapReduce
> from HDFS, so that they can release separately and are maintained by
> separate teams. So I believe a strong division of these code trees
> and release artifacts should remain.
>
> I'd like to get rid of Common. It could either be merged into HDFS or
> gradually whittled away to nothing. I'd prefer the latter. If we move
> to different RPC and serialization systems (e.g., Avro) then Common's
> io, and ipc packages might be removed. Configuration might be
> replaced/merged with Jakarta Commons Configuration
> (http://commons.apache.org/configuration/). Similarly, the metrics
> and fs packages might be moved to Jakarta Commons. Such changes might
> be hard to do back-compatibly, however.

Merging the o.a.h.fs back into the hdfs repo would be helpful. It's a
pain to develop a file system with client and server split into
multiple repositories, and the other fs implementations probably do
not want their own repository since they need to get updated when the
clients change as well.

Developing and releasing hadoop post-project split has been a pain,
going back to two repos (merging common and hdfs) or a single repo
would make the life of people developing and releasing easier. As you
point out, users want to consume hadoop as a single project and not
worry about common, mr and hdfs as separately released and versioned
components, so I'm not sure which community the split is serving.

Thanks,
Eli

> I don't see that merging the Jira databases or mailing lists for HDFS
> and MapReduce offers big advantages. The redundant, coordinated
> Jira's tend to be between Common the others, no?
>
> Doug
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Arun C Murthy 2010-08-11, 05:04

On Aug 9, 2010, at 9:26 AM, Doug Cutting wrote:
> On 08/08/2010 12:21 PM, Arun C Murthy wrote:
>> This of course begs a larger question - should we just merge Common,
>> HDFS & Map-Reduce together and be done with?
>
> I think there's still a reasonable long-term goal to split MapReduce
> from HDFS, so that they can release separately and are maintained by
> separate teams. So I believe a strong division of these code trees
> and release artifacts should remain.

It is clear from the comments on this thread that the move to split
HDFS and Map-Reduce into separate projects has been a mixed bag -
maybe a net negative.

It has caused issues we as a community haven't dealt well with.

So, we need to make a decision - do we stick the course, get through
the painful process and emerge stronger or do we take a step back and
regress a decision we made back then? The current proposal to merge
committer lists seems like a regression.

Arun
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Vinod KV 2010-08-11, 05:35

I know of very few to nearly zero number of patches in mapreduce that
touch HDFS also. And vice versa. The common case is of patches that
touch both common and mapreduce or common and hdfs. I am one of those
who has access to mapreduce project but not to common project and find
more than 50% of the patches that fall into this category. (*)

Maybe then we should simply knock off the separate list for common
project and maintain separate lists for mapreduce and hdfs, members of
which will automatically have karma for the common project.

As for the separation of the repositories, I personally felt
separation of mapreduce from hdfs helped focusing on things a lot
better. The last gasp work done for 0.21, mostly by Tom, did help a
lot in decoupling the projects. Common is the hot point, sure, but as
others noted, that is a separate discussion.

My few cents.

+vinod
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Tom White 2010-08-11, 15:51

On Tue, Aug 10, 2010 at 10:35 PM, Vinod KV <[EMAIL PROTECTED]> wrote:
> I am one of those who has access to
> mapreduce project but not to common project

This looks like an oversight to me, which we should remedy.

Also, I did a little analysis, and there are only six committers out
of a total of 30 that do not have commit privileges for all three
subprojects. So, practically speaking, it would not be a big change to
give all committers privileges across the projects.

Tom
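
For anyone who wants to repeat the kind of tally Tom describes against the actual committer rosters, a minimal sketch follows. The names and list contents are hypothetical placeholders, since the thread does not reproduce the real rosters; only the set arithmetic is the point.

```python
# Hypothetical rosters for illustration only; the real committer lists
# are not given in this thread.
COMMON = {"alice", "bob", "carol", "dave"}
HDFS = {"alice", "bob", "erin"}
MAPREDUCE = {"alice", "carol", "dave", "erin"}


def partial_committers(*rosters):
    """Return committers who are on at least one roster but not on all."""
    everyone = set().union(*rosters)
    on_all = set.intersection(*rosters)
    return everyone - on_all


missing = partial_committers(COMMON, HDFS, MAPREDUCE)
total = len(COMMON | HDFS | MAPREDUCE)
print(f"{len(missing)} of {total} committers lack rights to all three subprojects")
```

Substituting the real lists would give the "six out of 30" style figure directly; the set difference also names who would gain rights under a merge.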
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Chris Douglas 2010-08-14, 02:41

On Tue, Aug 10, 2010 at 10:35 PM, Vinod KV <[EMAIL PROTECTED]> wrote:
> I know of very few to nearly zero number of patches in mapreduce that
> touch HDFS also. And vice versa. The common case is of patches that
> touch both common and mapreduce or common and hdfs.

This is a good point, though changes in Common and HDFS interfaces do
break MapReduce, particularly unit tests that take liberties with
interface visibility. It would be convenient if HDFS committers could
push in the fix with the original issue, to shrink the window where MR
is broken, but in practice such changes are usually committed in short
order. After all, most committers have commit access to all three
projects... though this is one of the reasons why the constraint is
difficult to justify.

> I am one of those who has access to
> mapreduce project but not to common project and find more than 50% of
> the patches that fall into this category.

This *was* an oversight that should be corrected either through this
vote or independently.

> As for the separation of the repositories, I personally felt
> separation of mapreduce from hdfs helped focusing on things a lot
> better. The last gasp work done for 0.21, mostly by Tom, did help a
> lot in decoupling the projects. Common is the hot point, sure, but as
> others noted, that is a separate discussion.

+1

----

The discussion appears to be dying down. Quick summary of comments so
far:

* That all HDFS and MapReduce committers should have commit rights to
  Common appears to be undisputed.

* The value of splitting the projects at all is disputed. Some have
  argued that it has complicated work without delivering the benefits
  to developers it promised, though others have experienced this as
  discipline rather than inconvenience. Most of these complaints
  reference patches that touch Common, particularly actively-developed
  packages like fs. The vote concerns a narrower question than the
  project split, though it's fair to assert a lack of unanimity on the
  premise of a split Hadoop, let alone the more limited question of
  whether to retain a split list of committers.

* Not everyone agrees that combining HDFS and MapReduce committers is
  sound. While there is sufficient overlap today to branch all three
  together, patch releases could- and likely will- be cut
  independently. Not everyone thinks the two projects are developing
  independent communities, but none have difficulty imagining TLP
  status for both.

Please feel free to amend this if I left out an important point.

-----

I think we're ready to vote. Though we have no bylaws to amend, this
would be a modification to them, I guess. The last-proposed set
required a 2/3 majority of the PMC, IIRC. Since adding a committer
requires consensus on the PMC, it's probably fair to require a 2/3
majority to cross-pollinate lists instead of a simple majority.

Though the vote could be conducted on a huge cross-post to
mapreduce-dev@, hdfs-dev@ and common-dev@, it'll be easier to count if
it's run on general@; I'll start it here on Monday if nobody minds. -C
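
To make the difference between the two thresholds concrete, here is a small sketch of the arithmetic. It rests on assumptions the thread does not settle: that only +1 and -1 votes count toward the total, that abstentions are ignored, and that the 2/3 bar is inclusive. The function names are illustrative.

```python
from fractions import Fraction


def passes_two_thirds(plus_ones, minus_ones):
    """True if +1 votes are at least 2/3 of the votes cast.

    Assumption: abstentions are not counted and the bar is inclusive;
    the thread does not specify either detail.
    """
    cast = plus_ones + minus_ones
    if cast == 0:
        return False
    # Exact rational comparison avoids floating-point edge cases at 2/3.
    return Fraction(plus_ones, cast) >= Fraction(2, 3)


def passes_simple_majority(plus_ones, minus_ones):
    """True if +1 votes strictly outnumber -1 votes."""
    return plus_ones > minus_ones


# A 7-to-5 result clears a simple majority but not the 2/3 bar.
print(passes_simple_majority(7, 5), passes_two_thirds(7, 5))
```

Under these assumptions a 12-member vote needs at least 8 in favor to pass at 2/3, versus 7 for a simple majority.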
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Stack 2010-08-07, 04:45

+1 on committers getting access to common+hdfs+mr.

Thanks for putting up the comprehensive pros and cons Chris. It makes
it so I don't have to think. The only CON that gives me pause is item
3), but my guess is that it's unlikely and that if it should ever
happen, as a community, we'd figure some way around (over/under) the
roadblock.

For those who might be strong on HDFS but not on MR (or vice-versa), I
think it's OK having 'privileges' though you may not exercise them. In
being nominated for committer status, a nominee has probably passed
the baseline common sense test that has it that they'll NOT be
applying patches in areas they do not well-understand.

St.Ack
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Tom White 2010-08-09, 16:21

+1 to having a single committer list for Common, HDFS, and MapReduce.

The projects are currently closely aligned and there have been cases
where a committer couldn't commit what was logically a single patch
because it was split between Common and MapReduce.

I think we should make this change regardless of whether we ultimately
split the projects into TLPs, we merge them back into a single
project, or we keep them as three subprojects within the Hadoop TLP
(as they are now).

I would also suggest that the question of what to do about the project
split, although related, probably deserves a separate discussion.

Tom
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Nigel Daley 2010-08-11, 04:02

+1 on merging the committer lists for now. I think separating them was
trying to solve for problems that we don't actually have in our
community. We can always separate them again later if needed.

n.
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Konstantin Boudnik 2010-08-09, 18:47

+1 on combining the committer roles in one (as Doug has mentioned
elsewhere, all three feel like a single project and are released as
such, therefore it doesn't make much sense to split committers). In
effect, MR testing that pulls in "external" HDFS dependencies works as
a poor man's integration testing.

-1 on combining the projects back together. While having them
separated might cause an extra maintenance burden, the same separation
forces cleaner design decisions.

Cos
-
Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Dhruba Borthakur 2010-08-14, 01:53

+1 on combining committers list for common + hdfs + mapreduce. It is
mostly the same set of users who contribute to all three projects.

thanks,
dhruba

--
Connect to me at http://www.facebook.com/dhruba