-
[DISCUSS] HBase as TLP
Stack, 2010-04-08, 04:02

The HBase subproject has voted to become a TLP: http://su.pr/1g0HAN

Does the Hadoop community have any questions or concerns about this proposal?

Please don't vote yet in response to this. I'll call a formal vote after questions, if any, have been resolved.

Thanks,
St.Ack
(Thanks for the boilerplate, Doug Cutting)
-
Re: [DISCUSS] HBase as TLP
Alan Gates, 2010-04-08, 17:12

I have a concern. HDFS, MapReduce, HBase, Hive, and Pig taken together form a coherent software stack. I suspect that many users see this as a whole, in the same way they see a Linux distribution as a whole, without remembering that Linux is really the kernel while other GNU components are added into the distribution. HDFS and MapReduce form a base on which the other projects depend. HBase, Hive, and Pig extend the base to many more users, both by making it easier to use and by bringing new functionality. Thus splitting them up does not make sense to me. They form a whole; why not keep them as a whole?

I know that the response will be that becoming a top-level project doesn't mean they cannot continue to function as a whole; that this is merely a governance issue and not an issue of how projects work together (see for example http://bit.ly/9ylAYS). But I remain skeptical. The structure of governing hierarchies always influences the cohesion of a group. It would help me if advocates of this position could point to successful, separate Apache TLPs that are either completely dependent on another project (as HBase would be on Hadoop) or significantly dependent to extend functionality (as Hadoop would be on HBase).

This is not to say that I do not understand the value of having PMCs from these growing subprojects report directly to Apache. But I am concerned that we are letting this valid concern overrule other equally valid concerns without considering the tradeoffs.

Alan.
-
Re: [DISCUSS] HBase as TLP
Doug Cutting, 2010-04-08, 17:50

Alan Gates wrote:
> It would help me if advocates of this position could point to
> successful, separate Apache TLPs that are either completely dependent on
> another project (as HBase would be on Hadoop) or significantly dependent
> to extend functionality (as Hadoop would be on HBase).

HTTPD requires APR. Jackrabbit, Roller and probably others require Lucene. Nutch requires Hadoop. Continuum and others require Derby.

Doug
-
Re: [DISCUSS] HBase as TLP
Stack, 2010-04-09, 00:15

On Thu, Apr 8, 2010 at 10:12 AM, Alan Gates <[EMAIL PROTECTED]> wrote:
> I have a concern. HDFS, mapreduce, Hbase, Hive, and Pig taken together form
> a coherent software stack.

... if it had been designed by Dr. Frankenstein (half-joke!).

I think it a stretch to describe the Hadoop suite as 'coherent'. Pieces in the above stack can be swapped out and other components that perform a similar function put in their place. Rare is the install that uses all of the elements above (and the above is not a complete list of all Hadoop subprojects). Some of the components have inter-dependencies, but others have no dependency on other subprojects at all and can be run without reference to other pieces of the Hadoop set. For many, Hadoop looks more like a 'bucket' of software than a 'coherent' software stack.

I'm trying to undo the premise upon which the rest of your mail depends.

St.Ack
-
Re: [DISCUSS] HBase as TLP
Jay Booth, 2010-04-09, 01:41

What if the projects were:

A) split out to TLPs because they do seem to have reached that level of individual community

but,

B) The projects could somehow jointly put out an integrated build containing the above projects and let users run whatever they want out of it?

That would require a lot of coordination but would make a heck of a 1.0 release. If there's a reference build, it'll be easier to identify and fix cross-component bugs, and the release process might surface more of them early. It seems like that could help with a number of St.Ack's concerns about reanimated monsters.

Maybe it could be owned by the Common project, with participation from the project leads? Just an idea.

-Jay
-
Re: [DISCUSS] HBase as TLP
Arun C Murthy, 2010-04-09, 02:02

On Apr 8, 2010, at 6:41 PM, Jay Booth wrote:
> What if the projects were:
>
> A) split out to TLPs because they do seem to have reached that level of
> individual community
>
> but,
>
> B) The projects could somehow jointly put out an integrated build
> containing the above projects and let users run whatever they want out of
> it?
>
> That would require a lot of coordination but would make a heck of a 1.0
> release,

1.0 release of what?

Arun
-
Re: [DISCUSS] HBase as TLP
Jay Booth, 2010-04-09, 03:40

I'm not sure exactly what I meant by "1.0"; a 1.0 of "Hadoop", I guess. I was trying to address the concerns raised, which I share. Alan's concern is that if the projects are completely separate from each other, that might decrease visibility into the demands they place on each other when integrated. St.Ack mentioned the Frankenstein factor, which I think we've all felt some pain from, and which may get worse after the project split. What's the standard way to deploy the three, even? Is there one?

If the PMCs jointly maintained some sort of 'stable integrated build' that took in new releases from the TLPs after a soak period, it could provide a common touchstone that bugs could be tested against and cross-component patches delivered against. That could increase visibility of cross-component issues while providing a less cobbled-together system to administer. On the other side, though, if executed badly, you'd be creating a committee of committees and possibly undoing some of the benefits of going TLP in the first place, especially if politics heat up over what goes into the 'standard' build. I think it could be viable though.
-
Re: [DISCUSS] HBase as TLP
Tom White, 2010-04-09, 04:00

Eclipse does big bang releases of multiple components, but I believe it requires a huge amount of coordination and planning. Instead, I think the direction Hadoop should move in is to stabilize and clearly demarcate its core filesystem and MapReduce interfaces, so that projects like HBase, Pig, and Hive can run against multiple versions of core. Their release cycles are already largely decoupled from core, so the question of whether they become TLPs has more to do with project governance than with release coordination.

Cheers,
Tom
-
Re: [DISCUSS] HBase as TLP
Imran M Yousuf, 2010-04-09, 04:07

+1

I feel the same. From following HBase, seeing its releases depend directly on Hadoop releases gets me thinking...

Best regards,

Imran

--
Imran M Yousuf
Entrepreneur & Software Engineer
Smart IT Engineering
Dhaka, Bangladesh
Email: [EMAIL PROTECTED]
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
-
Re: [DISCUSS] HBase as TLP
Jay Booth, 2010-04-09, 06:09

Alright, I totally agree. Thanks for putting it that way.

-Jay
-
Re: [DISCUSS] HBase as TLP
Stack, 2010-04-12, 15:42

It's been a while since there's been a peep out of this thread, so I'll now move this topic to a vote. Thanks to all who contributed to the discussion.

St.Ack