Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-11-30, 01:51
>> Python as runtime requirement. Are you planning to migrate all BASH scripts
>> provided by Hadoop (or dynamically created, i.e. launcher scripts) to Python?

I don't intend to mandate use of Python. Rather, I want there to be a cross-platform option available. Things that are best done in a platform-specific manner should be done in shell for Linux and PowerShell for Windows. But things that are best done in a platform-independent way can be, with a lower long-term maintenance cost than using different scripts per platform.

This means that some, but not all, existing scripts may naturally migrate to Python as the overall system is ported to Windows. Hopefully when someone is porting a script that can be well done in a platform-independent way, they will be able to choose Python and write a single script that can replace the shell script and make it unnecessary to maintain two scripts (doing the same job but in different languages!) going forward.

>> What else in the current build, besides saveVersion.sh, do you see as a
>> candidate to be migrated to Python?

I have a greatly improved version of src/docs/relnotes.py that I would like to submit, for auto-gen of release notes. That's all that I have on my hotlist right now, although I anticipate that some of the shell scripts invoked by ant may be natural candidates.

>> How are you planning to define what Python modules can be used? Will
>> developers have to install them manually?

That's something the community will work out, the same way they decide what library jars to include, and when to upgrade those versions. But first, let's get an agreement in principle that this is the direction we want to go.

Cheers,
--Matt
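As a rough illustration of the kind of cross-platform build-time script discussed above, the sketch below records build metadata in a properties file, which is broadly the job of a saveVersion-style step. It is not the actual saveVersion.sh logic: the function names, the output file name `build-version.properties`, and the chosen fields are all assumptions made for illustration.

```python
import getpass
import os
import time


def build_version_info(version, revision="Unknown", user=None, now=None):
    """Collect build metadata; the field names are illustrative only."""
    if user is None:
        user = getpass.getuser()  # portable across Linux and Windows
    if now is None:
        now = time.time()
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return {"version": version, "revision": revision, "user": user, "date": stamp}


def render_properties(info):
    """Render the metadata as sorted key=value lines."""
    return "".join("{0}={1}\n".format(key, info[key]) for key in sorted(info))


def write_version_file(out_dir, info, name="build-version.properties"):
    """Write the rendered metadata under out_dir and return the file path."""
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    path = os.path.join(out_dir, name)
    with open(path, "w") as handle:
        handle.write(render_properties(info))
    return path
```

The same file runs unchanged on Linux and Windows, which is the maintenance argument made in the message above; whether it fits the real build is a separate question.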

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Alejandro Abdelnur 2012-11-30, 01:25
Matt,

Let me repost my previous questions and a few more. I'd appreciate your answers, as they will help me understand the full impact this would have on Hadoop and related projects.

* Python as runtime requirement. Are you planning to migrate all BASH scripts provided by Hadoop (or dynamically created, i.e. launcher scripts) to Python?
* What else in the current build, besides saveVersion.sh, do you see as a candidate to be migrated to Python?
* How are you planning to define what Python modules can be used? Will developers have to install them manually?
* What kind of tasks do you envision Python scripts will enable that are not possible today?
* Will the requirement of Python be pushed to clients using the hadoop script? If so, this would affect all downstream projects that use the hadoop script in one way or the other, right?

Is the main motivation of the proposal to make things easier for Windows, so there is no need for Cygwin? If that is the case, have you considered doing BAT scripts directly? If you take Tomcat for example, they have BAT scripts and SH scripts and things work quite nicely.

Personally, I wouldn't be thrilled to see the logic in the scripts get more complex; I'd rather go in the opposite direction. IMO, scripts should be trimmed to set env vars (with no voodoo logic), build the classpath (with no voodoo logic, just from a set of dirs) and call Java.

Finally, this is a code change, so I'm not sure why we are doing a vote.

Thx.

On Thu, Nov 29, 2012 at 3:26 PM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:

> Matt, thanks for the clarification.
>
> I may have missed the main point of the PROPOSAL thread then. I personally
> want to continue the discussion before voting.
>
> * Python as runtime requirement. Are you planning to migrate all BASH
> scripts provided by Hadoop (or dynamically created, i.e. launcher scripts)
> to Python?
> * What else in the current build, besides saveVersion.sh, do you see as a
> candidate to be migrated to Python?
> * How are you planning to define what Python modules can be used? Will
> developers have to install them manually?
>
> Cheers
>
> On Thu, Nov 29, 2012 at 2:39 PM, Matt Foley <[EMAIL PROTECTED]> wrote:
>
>> Hi Alejandro,
>> Please see in-line below.
>>
>> On Mon, Nov 26, 2012 at 1:52 PM, Alejandro Abdelnur <[EMAIL PROTECTED]>
>> wrote:
>>
>> > Matt,
>> >
>> > The scope of this vote seems different from what was discussed in the
>> > PROPOSAL thread.
>> > In the PROPOSAL thread you indicated this was for Hadoop1 because it is
>> > ANT based. And the main reason was to remove saveVersion.sh.
>> > Your #3 was not discussed in the proposal, was it?
>>
>> The item #3 was in my original statement of the problem, with which I
>> started the proposal thread. In fact, the thread title was "[PROPOSAL]
>> introduce Python as build-time and run-time dependency for Hadoop and
>> throughout Hadoop stack". It is true that only one or two people chose to
>> discuss #3 further in that thread.
>>
>> The point is not just to replace a single script, but to provide a means
>> to do cross-platform scripts, which will over time replace many
>> non-platform-specific scripts written in platform-specific languages.
>>
>> > It seems this vote is dragging in much more stuff than was originally
>> > discussed.
>> > I think you should suspend the vote, recap the motivation and then
>> > restart the vote.
>>
>> I respectfully disagree. I believe a careful reading of the cited
>> discussion thread, plus my own statement of the vote, provides sufficient
>> background for a thoughtful decision on the subject. Presumably so do the
>> ten other people who had already voted before you made that comment.
>>
>> If several other people want more discussion first, please speak up.
>> Thanks,
>> --Matt
>>
>> > As things are laid out at the moment my vote is:
>> >
>> > -1 (It still seems an overkill to introduce a new runtime requirement
>> > for building to replace a script.)
>> > +1 (I think this is the right way to simplify the build)

Alejandro
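The "set env vars, build the classpath from a set of dirs, and call Java" launcher described above can be sketched in a few lines of platform-independent Python. This is only an illustration of the idea, not Hadoop's actual launcher: the function names are hypothetical, and the main class in the usage note is a placeholder.

```python
import glob
import os


def build_classpath(dirs):
    """Join every .jar found directly inside the given directories."""
    jars = []
    for directory in dirs:
        jars.extend(sorted(glob.glob(os.path.join(directory, "*.jar"))))
    # os.pathsep is ":" on Unix and ";" on Windows, so no per-platform branch.
    return os.pathsep.join(jars)


def java_command(java_home, classpath, main_class, args):
    """Build the argument list for launching a JVM; nothing is executed here."""
    java = os.path.join(java_home, "bin", "java")
    return [java, "-classpath", classpath, main_class] + list(args)
```

A caller would pass the result to `subprocess.call`, for example `subprocess.call(java_command(os.environ["JAVA_HOME"], build_classpath(["lib"]), "org.example.Main", sys.argv[1:]))`, where `org.example.Main` and the `lib` directory are placeholders.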

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-11-30, 02:26
Hello again. Crossed in the mail.

> * What kind of tasks do you envision Python scripts will enable that are
> not possible today?

The point isn't to open brave new worlds. The point is to avoid the nightmare of having to maintain multiple "parallel" scripts doing the SAME THING in multiple scripting languages. I know from experience that they never get maintained right. It's just a huge source of bugs, because when they are in different languages, it can be quite difficult to determine that they are *really* doing the same thing. And in a case like shell vs PowerShell, it will be very common to have contributors who are not experts in both.

I care deeply about having a high-quality release on both Linux and Windows. And having a cross-platform scripting language will make it much easier to maintain that quality over time, without "slip" between the two platforms.

> * Will the requirement of Python be pushed to clients using the
> hadoop script? If so, this would affect all downstream projects that use
> the hadoop script in one way or the other, right?

If question #3 passes, then Python will become a run-time dependency for Hadoop. That means it would need to be installed as part of the Hadoop install preparation, just like all the other Hadoop run-time dependencies.

> Is the main motivation of the proposal to make things easier for Windows,
> so there is no need for Cygwin? If that is the case, have you considered
> doing BAT scripts directly? If you take Tomcat for example, they have BAT
> scripts and SH scripts and things work quite nicely.

Of course it is sufficient, from the simple implementation perspective, to translate all the shell scripts into bat or (better) PowerShell scripts. That is, in fact, the most evident alternative to my proposals #1 and #3.

However, I ask -- beg! -- the community to consider it from the software engineering perspective. We aren't here to just implement something once and be done. It has to be maintained, as most of you on this list are well aware, for years and years, across multiple generations. And trying to maintain parallel scripts in multiple languages, when not necessitated by genuine platform-specific requirements, is just creating bug generators in the system.

> Personally, I wouldn't be thrilled to see the logic in the scripts
> get more complex, but rather go in the opposite direction; IMO, scripts
> should be trimmed to set env vars (with no voodoo logic), build the
> classpath (with no voodoo logic, just from a set of dirs) and call Java.

See the first item above. The point is to enable cross-platform scripting of the things we already have to script. IMO, scripts should get out of the env var business entirely, but that's unrelated to this question :-)

> Finally, this is code change, so I'm not sure why we are doing a vote.

I view this as a tools issue that affects questions that go beyond the one-time choice of how to write (or re-write) saveVersion.sh. Also Aaron (atm) recommended that I bring it to the list. So here we are :-)

Cheers,
--Matt

RE: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Chuan Liu 2012-11-30, 03:22
+1 +1 +1

Agree with Matt on the code maintainability.

I think on one side we have shell, which is a scripting language and OS dependent (e.g. bash vs PowerShell); on the other side we have Java, which is not a scripting language and is OS independent. I would accept any scripting language that can fill the gap as an OS-independent scripting language. Personally, I also prefer Python over Ruby.

Thanks,
Chuan

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Bikas Saha 2012-11-30, 04:27
+1, +1, +1 (non-binding)

We have had promising results for 1 and 2 when porting to Windows. 3 would allow us to remove platform dependencies from test code. Agree that there might be some nuanced operations that require OS-specific environments, but this would keep them to a minimum.

Bikas

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Luke Lu 2012-11-30, 11:21
Thanks for the voting thread. Otherwise, many committers would have missed it.

I agree that this is a superset of a code change, with a larger impact than a typical code change.

On Thu, Nov 29, 2012 at 6:26 PM, Matt Foley <[EMAIL PROTECTED]> wrote:

> > Finally, this is code change, so I'm not sure why we are doing a vote.
>
> I view this as a tools issue, that affects questions that go beyond the
> one-time choice of how to write (or re-write) saveVersion.sh. Also Aaron
> (atm) recommended that I bring it to the list. So here we are :-)

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Luke Lu 2012-11-30, 12:57
I'd like to change my binding vote to -1, -0, -1.

Considering the Hadoop stack/ecosystem as a whole, I think the best cross-platform scripting language to adopt is JRuby, for the following reasons:

1. HBase already adopted JRuby for the HBase shell, which all current platform vendors support.
2. We can control the version of the language implementation on a per-release basis.
3. We don't have to introduce new dependencies in the de facto Hadoop stack (see 1).

I'm all for improving multi-platform support. I think the best way to do this is to have thin native script wrappers (using env vars) that call the cross-platform JRuby scripts.

__Luke

On Fri, Nov 30, 2012 at 3:21 AM, Luke Lu <[EMAIL PROTECTED]> wrote:

> Thanks for the voting thread. Otherwise, many committers would have missed
> it.
>
> I agree that this is a superset of code change that has larger impact than
> typical code change.
>
> On Thu, Nov 29, 2012 at 6:26 PM, Matt Foley <[EMAIL PROTECTED]> wrote:
>
>> > Finally, this is code change, so I'm not sure why we are doing a vote.
>>
>> I view this as a tools issue, that affects questions that go beyond the
>> one-time choice of how to write (or re-write) saveVersion.sh. Also Aaron
>> (atm) recommended that I bring it to the list. So here we are :-)

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Steve Loughran 2012-11-30, 13:29
On 30 November 2012 12:57, Luke Lu <[EMAIL PROTECTED]> wrote:

> I'd like to change my binding vote to -1, -0, -1.
>
> Considering the hadoop stack/ecosystem as a whole, I think the best cross
> platform scripting language to adopt is jruby for following reasons:
>
> 1. HBase already adopted jruby for HBase shell, which all current platform
> vendors support.
> 2. We can control the version of language implementation at a per release
> basis.
> 3. We don't have to introduce new dependencies in the de facto hadoop
> stack. (see 1).

I don't see why these arguments should have any impact on using Python at build time, as it doesn't introduce any dependencies downstream. Yes, you need Python at build time, but that's no worse than having a protoc compiler, gcc and the automake toolchain.

> I'm all for improving multi-platform support. I think the best way to do
> this is to have a thin native script wrappers (using env vars) to call the
> cross-platform jruby scripts.

Were it not for the env-var configuration hierarchy mess that things are in today, I'd agree. Where do you set your env vars? hadoop-env.sh? Where does that come from? The Hadoop conf dir? How do you find that? An env variable, or a ../../conf from bin/hadoop.sh, which breaks once you start symlinking to hadoop/bin? Or do you assume a root installation in /etc/hadoop/conf, which points to /etc/alternatives/hadoop-conf, which can then point back to /etc/hadoop/conf.pseudo? And what about JAVA_HOME?

Those env vars are something I'd like to see the back of.
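The symlink problem mentioned above (a relative conf path breaking once bin/ is reached through a symlink) is one a cross-platform script can sidestep by resolving the script's real location first. The sketch below assumes a `<root>/bin/<script>` plus `<root>/conf` layout and lets `HADOOP_CONF_DIR` override it; the function name and the fallback layout are illustrative assumptions, not the behaviour of the shipped scripts.

```python
import os


def find_conf_dir(script_path, environ=None):
    """Return the configuration directory for a launcher script.

    An explicit HADOOP_CONF_DIR wins; otherwise resolve symlinks in
    script_path and assume the layout <root>/bin/<script>, <root>/conf.
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("HADOOP_CONF_DIR")
    if explicit:
        return explicit
    real_script = os.path.realpath(script_path)  # follows symlinks
    install_root = os.path.dirname(os.path.dirname(real_script))
    return os.path.join(install_root, "conf")
```

Because `os.path.realpath` follows the link back to the real install tree, invoking the script through a symlinked bin directory still finds the conf directory next to the real installation.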

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Luke Lu 2012-11-30, 14:02
On Fri, Nov 30, 2012 at 5:29 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:

> Yes, you need python at build time, but that's no worse than having a
> protoc compiler, gcc and the automake toolchain.

The problem is that Python is known to have _backward_ compatibility issues on various platforms. It would be very annoying and time-consuming to deal with support issues regarding Python versions on various platforms. I agree that autotools is a nightmare and should be converted (in branch-1 as well) to cmake (which has good versioning support :)

The goal is to have fewer external dependencies, not more, again mostly due to support issues. If we want to introduce an external dependency, we need to pick something that is easy to support compatibility-wise.

__Luke

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Luke Lu 2012-11-30, 13:49
On Fri, Nov 30, 2012 at 5:29 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:

> where do you set your env vars... and what about JAVA_HOME

There should be only two env vars (JAVA_HOME and HADOOP_HOME) to deal with in the native scripts (.bat on Windows and .sh on Unix platforms) to bootstrap the JRuby scripts, which deal with the rest of the envs.

__Luke

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Arun C Murthy 2012-12-02, 18:20
On Nov 29, 2012, at 6:26 PM, Matt Foley wrote:

> Hello again. Crossed in the mail.
>
>> * What kind of tasks you envision Python scripts will enable that are
>> not possible today?
>
> The point isn't to open brave new worlds. The point is to avoid the
> nightmare of having to maintain multiple "parallel" scripts doing the SAME
> THING in multiple scripting languages.

+1, +1, +1

Couldn't agree more. I don't want to be in the business of having the same logic in multiple platform-specific scripts; it doesn't make any sense.

Arun

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Radim Kolar 2012-11-30, 00:29
> * What else in the current build, besides saveVersion.sh, do you see as a
> candidate to be migrated to Python?

inline ant scripts

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Steve Loughran 2012-11-30, 13:20
On 30 November 2012 00:29, Radim Kolar <[EMAIL PROTECTED]> wrote:

> > * What else in the current build, besides saveVersion.sh, do you see as a
> > candidate to be migrated to Python?
>
> inline ant scripts

=0. Ant's versioning is stricter; you can pull down the exact JAR versions, and some of us in the Ant team worked very hard to get it going everywhere. You don't gain anything by going to .py

-steve

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Radim Kolar 2012-11-30, 13:40
>> inline ant scripts
>>
>> =0. Ant's versioning is stricter; you can pull down the exact Jar versions,
>> and some of us in the Ant team worked very hard to get it going everywhere.
>> You don't gain anything by going to .py

There are sh scripts inside the Maven Ant plugin stuff.

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Jitendra Pandey 2012-11-30, 22:49
+1, +1, +1

On Fri, Nov 30, 2012 at 5:40 AM, Radim Kolar <[EMAIL PROTECTED]> wrote:

>>> inline ant scripts
>>>
>>> =0. Ant's versioning is stricter; you can pull down the exact Jar
>>> versions, and some of us in the Ant team worked very hard to get it
>>> going everywhere.
>>> You don't gain anything by going to .py
>>
>> there are sh scripts inside maven ant plugin stuff

--
<http://hortonworks.com/download/>

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Steve Loughran 2012-12-01, 10:48
On 30 November 2012 13:40, Radim Kolar <[EMAIL PROTECTED]> wrote:

>>> inline ant scripts
>>>
>>> =0. Ant's versioning is stricter; you can pull down the exact Jar
>>> versions, and some of us in the Ant team worked very hard to get it
>>> going everywhere.
>>> You don't gain anything by going to .py
>>
>> there are sh scripts inside maven ant plugin stuff

Which is because there are some things you can't do in Java: run rpmbuild to pick up file permissions and hanging symlinks that only become valid on deployment. The reason Ant is used to start them is that Maven views trying to run native scripts as a forbidden action, probably popping up some patronising text: "you are trying to run a shell script, please look at maven.apache.org/wiki/whymavenwontletyoudothings/ to understand this". They also view building RPMs as something not to encourage.

(But we digress into an Ant vs Maven argument. I do actually appreciate the consistent target naming across projects and the ability for the IDE to set up structure; it's just the entire underlying architecture and implementation that I dislike.)

[VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-11-24, 20:13
For discussion, please see previous thread "[PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack".

This vote consists of three separate items:

1. Contributors shall be allowed to use Python as a platform-independent scripting language for build-time tasks, and add Python as a build-time dependency. Please vote +1, 0, -1.

2. Contributors shall be encouraged to use Maven tasks in combination with either plug-ins or Groovy scripts to do cross-platform build-time tasks, even under ant in Hadoop-1. Please vote +1, 0, -1.

3. Contributors shall be allowed to use Python as a platform-independent scripting language for run-time tasks, and add Python as a run-time dependency. Please vote +1, 0, -1.

Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to use Maven plug-ins or Groovy as the only means of cross-platform build-time tasks, or to simply continue using platform-dependent scripts as is being done today.

Vote closes at 12:30pm PST on Saturday 1 December.

---------

Personally, my vote is +1, +1, +1. I think #2 is preferable to #1, but still has many unknowns in it, and until those are worked out I don't want to delay moving to cross-platform scripts for build-time tasks.

Best regards,
--Matt
Matt Foley 2012-11-24, 20:13
-
RE: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Ivan Mitic 2012-11-29, 23:41
+1, +1, +1 (some comments inline)
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matt Foley
Sent: Saturday, November 24, 2012 12:13 PM
To: [EMAIL PROTECTED]
Subject: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

For discussion, please see previous thread "[PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack".

This vote consists of three separate items:

1. Contributors shall be allowed to use Python as a platform-independent scripting language for build-time tasks, and add Python as a build-time dependency.
Please vote +1, 0, -1.

2. Contributors shall be encouraged to use Maven tasks in combination with either plug-ins or Groovy scripts to do cross-platform build-time tasks, even under ant in Hadoop-1.
Please vote +1, 0, -1.
>>> I believe 1 & 2 in combination make total sense. I ported a few scripts to Python and, thus far, it has proved to be up to the task and to satisfy the cross-platform requirements. In my opinion, it is also important to agree on the version, as I've run into some breaking changes in version 3+.

3. Contributors shall be allowed to use Python as a platform-independent scripting language for run-time tasks, and add Python as a run-time dependency.
>>> This is a great aspirational goal! Maintaining two sets of scripts would be a real challenge.
Please vote +1, 0, -1.

Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to use Maven plug-ins or Groovy as the only means of cross-platform build-time tasks, or to simply continue using platform-dependent scripts as is being done today.

Vote closes at 12:30pm PST on Saturday 1 December.
---------
Personally, my vote is +1, +1, +1.
I think #2 is preferable to #1, but still has many unknowns in it, and until those are worked out I don't want to delay moving to cross-platform scripts for build-time tasks.

Best regards,
--Matt +
Ivan Mitic 2012-11-29, 23:41
-
RE: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Mahadevan Venkatraman 2012-11-30, 02:07
+1, +1, +1 (non-binding)
Supporting Comments:

Build-time scripts: Using a platform-independent language such as Python (or Maven in certain cases) will greatly help in reducing build breaks and improving build script maintainability.

Run-time scripts: Most run-time scripts are end-user visible and need to be run by an admin, such as starting/stopping a Hadoop cluster (hadoop-daemons), or by developers submitting a job (hadoop.cmd). There seem to be two types of script files:

- Scripts intended for a cluster admin or an IT admin:
  - It is desirable to use a common set of Python scripts that work across all platforms. However, in a Windows enterprise environment IT admins won't like it if they have to run Python scripts to start/stop a cluster. So for these, there should be a PowerShell interface wrapper that can accept the right parameters and pass them down to the Python script. Hopefully, the PowerShell layer can be a simple pass-through. This way the Python script is like any other Java code hidden behind a well-known API surface. IT admins can't debug or modify it easily, but this is fine, since for scripts like the aforementioned there is no requirement that IT admins be able to easily view/modify the underlying code.
  - For Windows-specific things not supported by Python natively, such as setting ACLs or starting/stopping Windows services, it should be possible to refactor the code appropriately. But a little bit of PowerShell/cmd for these call-outs would be unavoidable.
- Scripts intended for developers/cluster users:
  - Most of these scripts (e.g. hadoop.cmd) would be behind other API surfaces such as WebHDFS, ODBC, JDBC, Templeton etc. So the advantage of having a common script across platforms outweighs the use of cmd/PowerShell as a native Windows feature. Again, it should also be possible to provide simple PowerShell wrappers for a Windows environment.

Thanks, Mahadevan.
-----Original Message----- From: Ivan Mitic [mailto:[EMAIL PROTECTED]] Sent: Thursday, November 29, 2012 3:41 PM To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: RE: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack +1, +1, +1 (some comments inline) -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matt Foley Sent: Saturday, November 24, 2012 12:13 PM To: [EMAIL PROTECTED] Subject: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack For discussion, please see previous thread "[PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack". This vote consists of three separate items: 1. Contributors shall be allowed to use Python as a platform-independent scripting language for build-time tasks, and add Python as a build-time dependency. Please vote +1, 0, -1. 2. Contributors shall be encouraged to use Maven tasks in combination with either plug-ins or Groovy scripts to do cross-platform build-time tasks, even under ant in Hadoop-1. Please vote +1, 0, -1. >>> I believe 1&2 in combination make a total sense. I ported a few scripts to Python, and thus far, it showed to be up to the task and satisfy the cross-platform requirements. In my option, it is also important to agree on the version, as I've run into some breaking changes in version 3+. 3. Contributors shall be allowed to use Python as a platform-independent scripting language for run-time tasks, and add Python as a run-time dependency. >>> This is a great aspirational goal! Maintaining two sets of scripts would be a real challenge. Please vote +1, 0, -1. Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to use Maven plug-ins or Groovy as the only means of cross-platform build-time tasks, or to simply continue using platform-dependent scripts as is being done today. 
Vote closes at 12:30pm PST on Saturday 1 December. --------- Personally, my vote is +1, +1, +1. I think #2 is preferable to #1, but still has many unknowns in it, and until those are worked out I don't want to delay moving to cross-platform scripts for build-time tasks. Best regards, +
Mahadevan Venkatraman 2012-11-30, 02:07
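The arrangement described in the message above (one shared Python script, with only the unavoidable platform-specific work pushed out to a native script) could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `daemon_command`, the `bin` directory layout, and the native script names `daemon-native.cmd` and `daemon-native.sh` are hypothetical and are not taken from the Hadoop source tree.

```python
import os
import sys


def daemon_command(action, daemon, hadoop_home, platform=None):
    """Build the argument list that would start or stop one daemon.

    The script names and directory layout are hypothetical; they only
    illustrate keeping shared logic in Python and calling out to a small
    native script for platform-specific work.
    """
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop', got %r" % (action,))
    # Allow the platform to be passed in so both branches can be exercised
    # on any machine; default to the platform we are running on.
    platform = platform or sys.platform
    if platform.startswith("win"):
        # Windows-only concerns (services, ACLs) live in a native script.
        script = os.path.join(hadoop_home, "bin", "daemon-native.cmd")
        return ["cmd", "/c", script, action, daemon]
    script = os.path.join(hadoop_home, "bin", "daemon-native.sh")
    return ["/bin/sh", script, action, daemon]
```

A PowerShell wrapper of the kind proposed would then only need to forward its parameters to this script. The function returns the command rather than running it, so the sketch can be tried without touching a cluster.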
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Raja Aluri 2012-12-01, 00:57
+1, +1, +1 (non binding)
It makes it a lot easier to make build tools (those that cannot be developed easily using Maven) work across non-Unix-like platforms (especially Windows).

Raja

On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt > +
Raja Aluri 2012-12-01, 00:57
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Eli Collins 2012-12-01, 01:08
-1, 0, -1
IIUC the only platform we plan to add support for that we can't easily support today (w/o an emulation layer like cygwin) is Windows, and it seems like making the bash scripts simpler and having parallel bat files is IMO a better approach. On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt +
Eli Collins 2012-12-01, 01:08
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Steve Loughran 2012-12-01, 10:44
On 1 December 2012 01:08, Eli Collins <[EMAIL PROTECTED]> wrote:
> -1, 0, -1
>
> IIUC the only platform we plan to add support for that we can't easily
> support today (w/o an emulation layer like cygwin) is Windows, and it
> seems like making the bash scripts simpler and having parallel bat
> files is IMO a better approach.

WinNT Bat/CMD files are the worst possible scripting language invented. At the very least, .py should be the language of choice there. +
Steve Loughran 2012-12-01, 10:44
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Doug Cutting 2012-12-01, 18:23
On Sat, Dec 1, 2012 at 2:44 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> WinNT Bat/CMD files are the worst possible scripting language invented. At
> the very least, .py should be the language of choice there

The scripts should not have so much logic that .bat files are a problem.

Doug +
Doug Cutting 2012-12-01, 18:23
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Konstantin Boudnik 2012-12-13, 00:53
On Sat, Dec 01, 2012 at 10:44AM, Steve Loughran wrote:
> On 1 December 2012 01:08, Eli Collins <[EMAIL PROTECTED]> wrote:
>
> > -1, 0, -1
> >
> > IIUC the only platform we plan to add support for that we can't easily
> > support today (w/o an emulation layer like cygwin) is Windows, and it
> > seems like making the bash scripts simpler and having parallel bat
> > files is IMO a better approach.
> >
>
> WinNT Bat/CMD files are the worst possible scripting language invented. At
> the very least, .py should be the language of choice there

Compared to the OS in question, it isn't _that_ bad ;) +
Konstantin Boudnik 2012-12-13, 00:53
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Doug Cutting 2012-11-30, 16:55
-1, +1, -1
Run- & build-time scripting should be limited to operations that are impossible in Java. These should not be complex nor should we encourage more complexity in them. A parallel set of simple .bat files for such operations seems preferable to adding a Python dependency. Doug On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt +
Doug Cutting 2012-11-30, 16:55
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Joep Rottinghuis 2012-12-01, 20:28
0, 0, -1 (non-binding)
Joep On Nov 24, 2012, at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt +
Joep Rottinghuis 2012-12-01, 20:28
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Eric Yang 2012-12-02, 06:07
-1, +1, -1
Python has fairly inconsistent support across all major OS vendors. It is hard to get it right unless the scripts are all designed to make use of Python 2.4. However, Python 2.4 doesn't have the necessary OS features to make Python useful in a runtime or build environment unless you write a lot of custom modules, which defeats the purpose of using Python as an intermediate layer to do OS-dependent work. JRuby may be a better choice.

regards, Eric

On Sat, Dec 1, 2012 at 12:28 PM, Joep Rottinghuis <[EMAIL PROTECTED]>wrote: > 0, 0, -1 (non-binding) > > Joep > > On Nov 24, 2012, at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > > > For discussion, please see previous thread "[PROPOSAL] introduce Python > as > > build-time and run-time dependency for Hadoop and throughout Hadoop > stack". > > > > This vote consists of three separate items: > > > > 1. Contributors shall be allowed to use Python as a platform-independent > > scripting language for build-time tasks, and add Python as a build-time > > dependency. > > Please vote +1, 0, -1. > > > > 2. Contributors shall be encouraged to use Maven tasks in combination > with > > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > > even under ant in Hadoop-1. > > Please vote +1, 0, -1. > > > > 3. Contributors shall be allowed to use Python as a platform-independent > > scripting language for run-time tasks, and add Python as a run-time > > dependency. > > Please vote +1, 0, -1. > > > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors > to > > use Maven plug-ins or Groovy as the only means of cross-platform > build-time > > tasks, or to simply continue using platform-dependent scripts as is being > > done today. > > > > Vote closes at 12:30pm PST on Saturday 1 December. > > --------- > > Personally, my vote is +1, +1, +1. 
> > I think #2 is preferable to #1, but still has many unknowns in it, and > > until those are worked out I don't want to delay moving to cross-platform > > scripts for build-time tasks. > > > > Best regards, > > --Matt > +
Eric Yang 2012-12-02, 06:07
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Konstantin Boudnik 2012-12-13, 00:55
On Sat, Dec 01, 2012 at 10:07PM, Eric Yang wrote:
> -1, +1, -1
>
> Python has fairly inconsistent support across all major OS vendors. It is
> hard to get it right unless the scripts are all designed to make use of
> Python 2.4. However, Python 2.4 doesn't have necessary OS features to make
> Python useful in runtime or build environment unless you write a lot of
> custom modules. Which defeats the purpose to use python as intermediate
> layer to do OS dependent work. Jruby may be a better choice.

JRuby? Really? Groovy is already there, and it is really a Java dialect, unlike JRuby. And yes, it is quite suitable for build things, considering the use of it in BigTop.

Cos

> On Sat, Dec 1, 2012 at 12:28 PM, Joep Rottinghuis <[EMAIL PROTECTED]>wrote: > > > 0, 0, -1 (non-binding) > > > > Joep > > > > On Nov 24, 2012, at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > > > > > For discussion, please see previous thread "[PROPOSAL] introduce Python > > as > > > build-time and run-time dependency for Hadoop and throughout Hadoop > > stack". > > > > > > This vote consists of three separate items: > > > > > > 1. Contributors shall be allowed to use Python as a platform-independent > > > scripting language for build-time tasks, and add Python as a build-time > > > dependency. > > > Please vote +1, 0, -1. > > > > > > 2. Contributors shall be encouraged to use Maven tasks in combination > > with > > > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > > > even under ant in Hadoop-1. > > > Please vote +1, 0, -1. > > > > > > 3. Contributors shall be allowed to use Python as a platform-independent > > > scripting language for run-time tasks, and add Python as a run-time > > > dependency. > > > Please vote +1, 0, -1. > > > > > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors > > to > > > use Maven plug-ins or Groovy as the only means of cross-platform > > build-time > > > tasks, or to simply continue using platform-dependent scripts as is being > > > done today. 
> > > > > > Vote closes at 12:30pm PST on Saturday 1 December. > > > --------- > > > Personally, my vote is +1, +1, +1. > > > I think #2 is preferable to #1, but still has many unknowns in it, and > > > until those are worked out I don't want to delay moving to cross-platform > > > scripts for build-time tasks. > > > > > > Best regards, > > > --Matt > > +
Konstantin Boudnik 2012-12-13, 00:55
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Tom White 2012-12-03, 14:23
+1, +1, -1
Tom On Sat, Nov 24, 2012 at 8:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt +
Tom White 2012-12-03, 14:23
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Chris Nauroth 2012-11-25, 07:18
+1, +1, +1 (non-binding)
On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt > +
Chris Nauroth 2012-11-25, 07:18
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Suresh Srinivas 2012-11-26, 20:41
+1, +1, +1
Regards, Suresh On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt > -- http://hortonworks.com/download/ +
Suresh Srinivas 2012-11-26, 20:41
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Konstantin Boudnik 2012-11-26, 18:30
-1, +1, -1
Thanks On Sat, Nov 24, 2012 at 12:13PM, Matt Foley wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt +
Konstantin Boudnik 2012-11-26, 18:30
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Radim Kolar 2012-11-26, 17:34
-1, +1, -1
+
Radim Kolar 2012-11-26, 17:34
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Colin McCabe 2012-11-26, 16:53
Nonbinding, but:
+1, +1, 0.

Also, let's please clearly define the versions of Python we support if we do choose to go this route. Something like 2.4+ would be reasonable. The process launching APIs in particular changed a lot in those early 2.x releases.

best, Colin

On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > For discussion, please see previous thread "[PROPOSAL] introduce Python as > build-time and run-time dependency for Hadoop and throughout Hadoop stack". > > This vote consists of three separate items: > > 1. Contributors shall be allowed to use Python as a platform-independent > scripting language for build-time tasks, and add Python as a build-time > dependency. > Please vote +1, 0, -1. > > 2. Contributors shall be encouraged to use Maven tasks in combination with > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > even under ant in Hadoop-1. > Please vote +1, 0, -1. > > 3. Contributors shall be allowed to use Python as a platform-independent > scripting language for run-time tasks, and add Python as a run-time > dependency. > Please vote +1, 0, -1. > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to > use Maven plug-ins or Groovy as the only means of cross-platform build-time > tasks, or to simply continue using platform-dependent scripts as is being > done today. > > Vote closes at 12:30pm PST on Saturday 1 December. > --------- > Personally, my vote is +1, +1, +1. > I think #2 is preferable to #1, but still has many unknowns in it, and > until those are worked out I don't want to delay moving to cross-platform > scripts for build-time tasks. > > Best regards, > --Matt +
Colin McCabe 2012-11-26, 16:53
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Chris Nauroth 2012-11-26, 17:44
Declaring 2.4 to be the minimum supported version sounds like a great idea.
I've worked with CentOS distributions that have a dependency on Python 2.4, and it was always awkward to get a later version on those machines. Thank you, --Chris On Mon, Nov 26, 2012 at 8:53 AM, Colin McCabe <[EMAIL PROTECTED]>wrote: > Nonbinding, but: > > +1, +1, 0. > > Also, let's please clearly define the versions of Python we support if > we do chooes to go this route. Something like 2.4+ would be > reasonable. The process launching APIs in particular changed a lot in > those early 2.x releases. > > best, > Colin > > > On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote: > > For discussion, please see previous thread "[PROPOSAL] introduce Python > as > > build-time and run-time dependency for Hadoop and throughout Hadoop > stack". > > > > This vote consists of three separate items: > > > > 1. Contributors shall be allowed to use Python as a platform-independent > > scripting language for build-time tasks, and add Python as a build-time > > dependency. > > Please vote +1, 0, -1. > > > > 2. Contributors shall be encouraged to use Maven tasks in combination > with > > either plug-ins or Groovy scripts to do cross-platform build-time tasks, > > even under ant in Hadoop-1. > > Please vote +1, 0, -1. > > > > 3. Contributors shall be allowed to use Python as a platform-independent > > scripting language for run-time tasks, and add Python as a run-time > > dependency. > > Please vote +1, 0, -1. > > > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors > to > > use Maven plug-ins or Groovy as the only means of cross-platform > build-time > > tasks, or to simply continue using platform-dependent scripts as is being > > done today. > > > > Vote closes at 12:30pm PST on Saturday 1 December. > > --------- > > Personally, my vote is +1, +1, +1. > > I think #2 is preferable to #1, but still has many unknowns in it, and > > until those are worked out I don't want to delay moving to cross-platform > > scripts for build-time tasks. 
> > > > Best regards, > > --Matt > +
Chris Nauroth 2012-11-26, 17:44
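The two concerns raised in this sub-thread (agreeing on a minimum interpreter version, and the process-launching APIs) can be illustrated with a small sketch. It assumes the 2.4 floor suggested above, which was a proposal in the thread and not an agreed project policy; the helper names `check_python` and `run` are hypothetical. The `subprocess` module used here was added to the standard library in Python 2.4, which is one reason that release is a natural floor for scripts that launch child processes.

```python
import subprocess
import sys

# Assumed floor, taken from the suggestion in this thread; not an agreed policy.
MIN_VERSION = (2, 4)


def check_python(minimum=MIN_VERSION, actual=None):
    """Return True if the interpreter version meets the minimum.

    `actual` can be supplied as a (major, minor) tuple to check a version
    other than the one currently running.
    """
    actual = tuple(actual or sys.version_info[:2])
    return actual >= tuple(minimum)


def run(args):
    """Run a child process and return (exit_code, captured_stdout)."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return proc.returncode, out
```

A build script could call `check_python()` first and exit with a clear message when the interpreter is too old, instead of failing later on a missing module.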
-
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Luke Lu 2012-11-26, 17:25
-1, +1, -1.
If we want to introduce a "platform independent" scripting language, we should not choose Python, as it has a bad track record for compatibility (between versions and platforms). +1 to using Groovy, as we can control the version of the Groovy jars included in our distribution.

__Luke

On Mon, Nov 26, 2012 at 8:53 AM, Colin McCabe <[EMAIL PROTECTED]> wrote:
> Nonbinding, but:
>
> +1, +1, 0.
>
> Also, let's please clearly define the versions of Python we support if
> we do choose to go this route. Something like 2.4+ would be
> reasonable. The process launching APIs in particular changed a lot in
> those early 2.x releases.
>
> best,
> Colin
>
> On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote:
> > For discussion, please see previous thread "[PROPOSAL] introduce Python as
> > build-time and run-time dependency for Hadoop and throughout Hadoop stack".
> >
> > This vote consists of three separate items:
> >
> > 1. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for build-time tasks, and add Python as a build-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > 2. Contributors shall be encouraged to use Maven tasks in combination with
> > either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> > even under ant in Hadoop-1.
> > Please vote +1, 0, -1.
> >
> > 3. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for run-time tasks, and add Python as a run-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> > use Maven plug-ins or Groovy as the only means of cross-platform build-time
> > tasks, or to simply continue using platform-dependent scripts as is being
> > done today.
> >
> > Vote closes at 12:30pm PST on Saturday 1 December.
> > ---------
> > Personally, my vote is +1, +1, +1.
> > I think #2 is preferable to #1, but still has many unknowns in it, and
> > until those are worked out I don't want to delay moving to cross-platform
> > scripts for build-time tasks.
> >
> > Best regards,
> > --Matt
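
For context on the point Colin raises above about process-launching APIs: Python's `subprocess` module, which was added in Python 2.4, is the standard cross-platform way to start a child process. The following is a minimal sketch and not code from this thread. The helper name `run_tool` is hypothetical, and the sketch is written for a current Python 3 interpreter rather than the 2.x releases under discussion.

```python
import subprocess
import sys


def run_tool(argv):
    """Run a command given as a list of arguments.

    Returns a tuple (exit_code, combined_output_text). Passing a list
    instead of a single shell string avoids platform-specific shell
    quoting rules, which is the main portability concern for launchers.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout
    )
    out, _ = proc.communicate()
    return proc.returncode, out.decode("utf-8", "replace")


if __name__ == "__main__":
    # Launch the current interpreter as a stand-in for any external tool.
    code, text = run_tool([sys.executable, "-c", "print('hello from a child process')"])
    print(code, text.strip())
```

The same call works unchanged on Linux and Windows because no shell is involved; only the contents of `argv` would differ per platform.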
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Giridharan Kesavan 2012-11-26, 21:16
+1, +1, +1
-Giri
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Alejandro Abdelnur 2012-11-26, 21:52
Matt,
The scope of this vote seems different from what was discussed in the PROPOSAL thread.

In the PROPOSAL thread you indicated this was for Hadoop 1 because it is Ant-based, and that the main reason was to remove saveVersion.sh.

Your #3 was not discussed in the proposal, was it?

It seems this vote is dragging in much more than was originally discussed. I think you should suspend the vote, recap the motivation, and then restart the vote.

As things are laid out at the moment, my vote is:

-1 (It still seems overkill to introduce a new runtime requirement for building to replace a script.)
+1 (I think this is the right way to simplify the build.)
-1 (AFAIK there is no such requirement at the moment, and if it comes it would be in the form of an AM, which I'd argue should live outside of Hadoop.)

Thx

-- Alejandro
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Radim Kolar 2012-11-26, 22:17
> In the PROPOSAL thread you indicated this was for Hadoop1 because it is ANT
> based. And the main reason was to remove saveVersion.sh.
>
> Your #3 was not discussed in the proposal, was it?

It was part of the original proposal but was not discussed much, because the language war was the more attractive option.

Do you want a vote like this?

1. Using an external language vs. a Maven plugin to build.
2. Using an external language for startup scripts vs. a JVM scripting language, such as Jython as used in WebSphere.
3. Choose Python as the external language.
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Robert Evans 2012-11-26, 16:16
+1, +1, 0
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Adam Berry 2012-11-26, 16:45
0, +1, -1 (non-binding)

Also, it feels like maybe the discussion should have been kept open a little longer; the Thanksgiving holidays last week meant that people may have missed it.

Cheers,
Adam
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Steve Loughran 2012-11-25, 12:39
On 24 November 2012 20:13, Matt Foley <[EMAIL PROTECTED]> wrote:
> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".
>
> This vote consists of three separate items:
>
> 1. Contributors shall be allowed to use Python as a platform-independent
> scripting language for build-time tasks, and add Python as a build-time
> dependency.
> Please vote +1, 0, -1.

+1

> 2. Contributors shall be encouraged to use Maven tasks in combination with
> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> even under ant in Hadoop-1.
> Please vote +1, 0, -1.

+1

My feelings on Maven are well known, but Groovy can mitigate things. And I'm not going to advocate post-M2 build tools such as Gradle. It's ironic that Maven's utter inflexibility forces people to use scripting languages to get their work done, but Groovy is fairly nimble here, and easy to learn for any Java programmer. "Groovy in Action" is the book to own.

> 3. Contributors shall be allowed to use Python as a platform-independent
> scripting language for run-time tasks, and add Python as a run-time
> dependency.
> Please vote +1, 0, -1.

+1. I look forward to never having to debug shell script env variable inheritance ever again.

This does not mean that I advocate writing big bits of the system in .py; as someone who is debugging OpenStack request throttling this weekend, I know that Python is not "the solution" to problems. For Hadoop it has a role, but the role should be ('better than bash') and ('streaming integration').

> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> use Maven plug-ins or Groovy as the only means of cross-platform build-time
> tasks, or to simply continue using platform-dependent scripts as is being
> done today.
>
> Vote closes at 12:30pm PST on Saturday 1 December.
> ---------
> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.
>
> Best regards,
> --Matt
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Doug Cutting 2012-12-03, 18:37
On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[EMAIL PROTECTED]> wrote:
> Vote closes at 12:30pm PST on Saturday 1 December.

It's not clear to me what kind of a vote this is. It seems closest to a code change vote, since it implies code changes, although without a specific patch yet proposed. As such it would follow lazy consensus rules. Or is it merely intended as a straw poll, to gauge community opinion?

Doug
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-12-03, 19:21
It is intended to be a "technical discussion", in the sense of the bylaws statement (in section "Roles and Responsibilities: Committers"), "Committers may cast binding votes on any technical discussion regarding any subproject." I therefore intended it to be a majority vote of Committers.

Interestingly, this need to discuss tooling and other issues that go beyond a simple "code change" is not addressed in the "Decision Making: Actions" section of the bylaws. That need seems to have been overlooked in the current rev of that section. But I do not agree that such issues are "code changes"; they relate to the tools we depend on to make code changes, which is clearly qualitatively different.

--Matt
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Doug Cutting 2012-12-03, 19:37
On Mon, Dec 3, 2012 at 11:21 AM, Matt Foley <[EMAIL PROTECTED]> wrote:
> It is intended to be a "technical discussion", in the sense of the bylaws
> statement (in section "Roles and Responsibilities: Committers"), "Committers
> may cast binding votes on any technical discussion regarding any
> subproject." I therefore intended it to be a majority vote of Committers.

I'm not sure how you conclude that technical discussions are resolved with majority votes.

http://www.apache.org/foundation/voting.html

> Interestingly, this need to discuss tooling and other issues that go beyond
> a simple "code change" is not addressed in the "Decision Making: Actions"
> section of the bylaws. That need seems to have been overlooked in the
> current rev of that section. But I do not agree that such issues are "code
> changes"; it relates to the tools we depend on to make code changes, which
> is clearly qualitatively different.

I don't see a striking difference between this and a proposed code change. How is a -1 here fundamentally different than a veto on a patch submitted to HADOOP-9082?

Doug
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-12-03, 22:08
Hi Doug,
The Apache voting process contradicts the Hadoop bylaws: http://www.apache.org/foundation/voting.html says that only PMC members can make binding votes on code modification issues, but http://hadoop.apache.org/bylaws.html says that Committers can make binding votes on them. Does that mean the Hadoop bylaws have to change?

Thanks,
--Matt
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Doug Cutting 2012-12-03, 23:57
On Mon, Dec 3, 2012 at 2:08 PM, Matt Foley <[EMAIL PROTECTED]> wrote:
> The apache voting process contradicts the Hadoop bylaws:
> http://www.apache.org/foundation/voting.html says that only PMC members can
> make binding votes on code modification issues, but
> http://hadoop.apache.org/bylaws.html says that Committers can make binding
> votes on them. Does that mean the Hadoop bylaws have to change?

This may be a little atypical but I don't see any harm. The Hadoop PMC is willing to respect the veto of any committer as binding. I'd worry more if we tried to reduce vetoes to a subset of the PMC than extend it to a superset.

Do you think this is problematic?

Doug
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-12-04, 01:22
No, but it speaks to whether the Hadoop bylaws can extend the Apache voting procedures and draw finer distinctions. For example, the Apache voting procedures only identify 3 types of votable issue, while the Hadoop bylaws identify 9 types of votable issues.

If we were forced to fit "development tools" into one of the three categories cited by the Apache voting procedures, it would be fitting a square peg in a round hole. Since we can instead look at the 9 categories provided by the Hadoop bylaws, we can acknowledge that "development tools" was an overlooked category. But in my opinion it certainly doesn't fit into the "code change" category. Tooling is a meta-issue regarding HOW we do what needs to be done. In this case, whether we allow a platform-independent solution, or force contributors to maintain parallel scripts in multiple platform-specific languages for no reason.

--Matt
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Doug Cutting 2012-12-04, 04:50
Hadoop's bylaws do draw finer distinctions than the Apache voting guidelines document, but we follow the same general principles that are described there.

As I understand it, the rationale for using consensus for code is that everyone needs to agree on everything in the codebase or we've disenfranchised some. We share a single code repository and we need to all agree on what goes into it. A release does not require majority since if someone doesn't agree on the timing of a release they can choose to make another at a different time, but every change that goes into each release requires consensus. We also require consensus for committers and PMC member votes so that we have a group that's coherent and is able to reach consensus on code changes.

Re-writing bash scripts in Python is neither a release nor other procedural issue. It involves changes to the software we maintain and seems to fall clearly into the "code change" category.

If you disagree then perhaps you'd like to propose a change to the bylaws so that scripts have different rules than other kinds of software, but I don't yet see the rationale for such a change.

Doug
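
The difference between a majority vote and a consensus vote, which is the crux of Doug's argument, can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the threshold of three +1 votes is an assumption modelled on common Apache practice and should be checked against the bylaws themselves, and the ballot in the example is made up rather than taken from this vote.

```python
def tally(votes):
    """Count the +1, 0 and -1 entries in a list of integer votes."""
    return {
        "plus": sum(1 for v in votes if v == 1),
        "zero": sum(1 for v in votes if v == 0),
        "minus": sum(1 for v in votes if v == -1),
    }


def passes_majority(votes, min_plus=3):
    """Majority rule (assumed form): enough +1 votes and more +1 than -1."""
    t = tally(votes)
    return t["plus"] >= min_plus and t["plus"] > t["minus"]


def passes_consensus(votes, min_plus=3):
    """Consensus rule (assumed form): enough +1 votes and no -1 at all.

    A single -1 acts as a veto regardless of how many +1 votes exist.
    """
    t = tally(votes)
    return t["plus"] >= min_plus and t["minus"] == 0


if __name__ == "__main__":
    ballot = [1, 1, 1, 1, -1]  # hypothetical ballot: four +1, one -1
    print("majority:", passes_majority(ballot))
    print("consensus:", passes_consensus(ballot))
```

Under these assumed rules the same ballot passes a majority vote but fails a consensus vote, which is why the classification of the question matters to the outcome.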
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-12-04, 17:58
Hi Doug,
I didn't read your email until this morning, but I spent time overnight thinking about the Apache Way and reached similar conclusions. While tooling is broader in scope than a single code change, it is a technical choice that we all have to live with. More importantly, "Community over Code" would suggest that if even slightly less than 50% of the community is uncomfortable with adding Python to the mix that is the Hadoop stack, then we probably shouldn't do it, regardless of the technical merits.

Therefore, I withdraw the question. We will search for other means of cleaning up the shell script problem and making all functionality work with parity in the Windows world.

I am quite partial to Allen Wittenauer's suggestion in HADOOP-9082 <https://issues.apache.org/jira/browse/HADOOP-9082?focusedCommentId=13507163&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13507163> that the scripts should be greatly simplified before dealing with the cross-platform question. It is in many respects silly to have so much functionality "on the side" instead of dealing with it forthrightly in core code. In that spirit, I am also -1 on burying the same complexity in Maven plug-ins, which after all just adds another couple of layers of complexity, and limits the number of people who understand it as well.

Thanks to all who voted and contributed to the discussion.

Best regards,
--Matt
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Radim Kolar 2012-12-04, 19:41
So the result of the vote is to close https://issues.apache.org/jira/browse/HADOOP-9073 and write the Groovy-in-pom.xml variant (option number 2)?
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-12-04, 20:28
Please close HADOOP-9073 as "will not fix", citing this discussion.
I'm -1 on Groovy in Maven. That's worse, not better. Let it sit for a while and let people propose simplifications of the script situation.

Thanks,
--Matt
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Alejandro Abdelnur 2012-12-04, 21:00
I've been playing around writing a couple of Maven plugins, one to replace saveVersion.sh and the other to invoke protoc. They both work in the standard Windows cmd (no Cygwin required). Together with HADOOP-8887 they would remove most of the scripting done in the POMs.

(They also work on Linux and OS X.)

They are Java-based and only require having SVN/Git and protoc available in the PATH.

If CMake works on Windows, I assume HADOOP-8887 would be almost there.

This would leave the tar stitching, which is done as a script to handle SO symlinks, though I have an idea on how we could take care of it.

I'll be creating a JIRA momentarily.

Thx

Alejandro
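
As a rough illustration of what a version-stamping step has to do on any platform (find the required tools on the PATH, then record version, revision, user and date), here is a small sketch. It is not Alejandro's plugin, which is Java-based, and it does not reproduce saveVersion.sh. The function names and the recorded fields are hypothetical, and the sketch assumes Python 3.3 or later for `shutil.which`.

```python
import getpass
import shutil
import subprocess
from datetime import datetime, timezone


def find_tools(names):
    """Map each tool name to its full path on the PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


def git_revision(cwd="."):
    """Return the current git commit hash, or "Unknown" if unavailable."""
    if shutil.which("git") is None:
        return "Unknown"
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "HEAD"], cwd=cwd, stderr=subprocess.DEVNULL
        )
    except (subprocess.CalledProcessError, OSError):
        # Not a git checkout, or git could not be executed.
        return "Unknown"
    return out.decode("ascii", "replace").strip() or "Unknown"


def current_user():
    """Return the login name, or "Unknown" if it cannot be determined."""
    try:
        return getpass.getuser()
    except Exception:
        return "Unknown"


def build_info(version, cwd="."):
    """Collect the (hypothetical) fields a version stamp might record."""
    return {
        "version": version,
        "revision": git_revision(cwd),
        "user": current_user(),
        "date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ"),
    }


if __name__ == "__main__":
    print(find_tools(["git", "svn", "protoc"]))
    print(build_info("0.0.0-example"))
```

Because every lookup goes through the PATH and no shell is invoked, the same script runs from a plain Windows cmd prompt as well as from a Unix shell, which is the property the plugins described above are after.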
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-12-04, 22:35
There's already a JIRA: HADOOP-8924 <https://issues.apache.org/jira/browse/HADOOP-8924>