Hadoop >> mail # dev >> Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack


Ivan Mitic 2012-11-29, 23:41
RE: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
+1, +1, +1 (non-binding)

Supporting Comments:

Build-time scripts: Using a platform-independent language such as Python (or Maven in certain cases) will greatly help reduce build breaks and improve build script maintainability.

Run-time scripts: Most run-time scripts are visible to end users. They are run either by admins, for example to start or stop a Hadoop cluster (hadoop-daemons), or by developers submitting a job (hadoop.cmd). There seem to be two types of script files:
    - Scripts intended for a cluster admin or an IT admin:
        - It is desirable to use a common set of Python scripts that work across all platforms. However, in a Windows enterprise environment, IT admins won't like having to run Python scripts to start/stop a cluster. So for these, there should be a PowerShell wrapper that accepts the right parameters and passes them down to the Python script. Hopefully, the PowerShell layer can be a simple pass-through. This way the Python scripts are like any other Java code hidden behind a well-known API surface. IT admins can't debug or modify them easily, but this is fine, since for scripts like the aforementioned there is no requirement that IT admins be able to easily view or modify the underlying code.
        - For Windows-specific things not supported natively by Python, such as setting ACLs or starting/stopping Windows services, it should be possible to refactor the code appropriately. But a little bit of PowerShell/cmd for these call-outs would be unavoidable.

    - Scripts intended for developers/cluster users:
        - Most of these scripts (e.g. hadoop.cmd) would be behind other API surfaces such as WebHDFS, ODBC, JDBC, Templeton, etc. So the advantage of having a common script across platforms outweighs the benefit of using cmd/PowerShell as a native Windows feature. Again, it should also be possible to provide simple PowerShell wrappers for a Windows environment.
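
The layering described above (common Python logic, with a thin native call-out only where the platform requires it) could look roughly like the sketch below. This is an illustration under stated assumptions, not code from Hadoop: the service-name prefix, the Unix helper script path, and the function names are all hypothetical. On Windows the sketch delegates to the native `sc` service controller; elsewhere it delegates to a shell helper.

```python
import argparse
import platform
import subprocess

# Hypothetical values for this sketch only; neither the service-name
# prefix nor the helper script path is taken from the Hadoop source.
WINDOWS_SERVICE_PREFIX = "hadoop-"
UNIX_DAEMON_SCRIPT = "bin/hadoop-daemon.sh"


def build_command(action, daemon, system=None):
    """Return the platform-specific command line to start or stop a daemon."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    system = system or platform.system()
    if system == "Windows":
        # Unavoidable native call-out: Windows services are controlled
        # through the service controller rather than from Python itself.
        return ["sc", action, WINDOWS_SERVICE_PREFIX + daemon]
    # Everywhere else, hand off to the (assumed) shell helper.
    return [UNIX_DAEMON_SCRIPT, action, daemon]


def main(argv=None):
    """Parse arguments and run (or, with --dry-run, just print) the command."""
    parser = argparse.ArgumentParser(description="Start or stop a daemon")
    parser.add_argument("action", choices=["start", "stop"])
    parser.add_argument("daemon")
    parser.add_argument("--dry-run", action="store_true",
                        help="print the command instead of running it")
    args = parser.parse_args(argv)
    cmd = build_command(args.action, args.daemon)
    if args.dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.call(cmd)
```

A PowerShell or cmd wrapper would then only need to forward its parameters to this script unchanged, which is the "simple pass-through" case; calling `main(["start", "namenode", "--dry-run"])` prints the command that would be run on the current platform without executing it.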

Thanks, Mahadevan.

-----Original Message-----
From: Ivan Mitic [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 29, 2012 3:41 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

+1, +1, +1 (some comments inline)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matt Foley
Sent: Saturday, November 24, 2012 12:13 PM
To: [EMAIL PROTECTED]
Subject: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

For discussion, please see previous thread "[PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack".

This vote consists of three separate items:

1. Contributors shall be allowed to use Python as a platform-independent scripting language for build-time tasks, and add Python as a build-time dependency.
Please vote +1, 0, -1.

2. Contributors shall be encouraged to use Maven tasks in combination with either plug-ins or Groovy scripts to do cross-platform build-time tasks, even under ant in Hadoop-1.
Please vote +1, 0, -1.

>>> I believe 1 & 2 in combination make total sense. I ported a few scripts to Python, and so far it has proved up to the task and satisfies the cross-platform requirements. In my opinion, it is also important to agree on the version, as I've run into some breaking changes in version 3+.

3. Contributors shall be allowed to use Python as a platform-independent scripting language for run-time tasks, and add Python as a run-time dependency.

>>> This is a great aspirational goal! Maintaining two sets of scripts would be a real challenge.
Please vote +1, 0, -1.

Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to use Maven plug-ins or Groovy as the only means of cross-platform build-time tasks, or to simply continue using platform-dependent scripts as is being done today.

Vote closes at 12

Personally, my vote is +1, +1, +1.
I think #2 is preferable to #1, but still has many unknowns in it, and until those are worked out I don't want to delay moving to cross-platform scripts for build-time tasks.

Best regards,