Hadoop >> mail # dev >> Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack


Matt Foley 2012-11-30, 01:51
Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Hello again.  Crossed in the mail.

> * What kind of tasks do you envision Python scripts will enable that are
> not possible today?
The point isn't to open brave new worlds.  The point is to avoid the
nightmare of having to maintain multiple "parallel" scripts doing the SAME
THING in multiple scripting languages.  I know from experience that they
never get maintained right.  It's just a huge source of bugs, because when
they are in different languages, it can be quite difficult to determine
that they are *really* doing the same thing.  And in a case like shell vs
powershell, it will be very common to have contributors who are not experts
in both.

I care deeply about having a high-quality release in both Linux and
Windows.  And having a cross-platform scripting language will make it much
easier to maintain that quality over time, without "slip" between the two
platforms.

> * Will the requirement of Python be pushed to clients using the
> hadoop script? If so, this would affect all downstream projects that use
> the hadoop script in one way or the other, right?
If question #3 passes, then Python will become a run-time dependency for
Hadoop.  That means it would need to be installed as part of the Hadoop
install preparation, just like all the other Hadoop run-time dependencies.

> Is the main motivation of the proposal to make things easier for Windows,
> so there is no need for cygwin? If that is the case, have you considered
> writing BAT scripts directly? If you take Tomcat for example, they have BAT
> scripts and SH scripts and things work quite nicely.
Of course it is sufficient, from the simple implementation perspective, to
translate all the shell scripts into bat or (better) powershell scripts.
That is, in fact, the most evident alternative to my proposals #1 and #3.

However, I ask -- beg! -- the community to consider it from the software
engineering perspective.  We aren't here to just implement something once
and be done.  It has to be maintained, as most of you on this list are well
aware, for years and years, across multiple generations.  And trying to
maintain parallel scripts in multiple languages, when not necessitated by
genuine platform-specific requirements, is just creating bug generators in
the system.

> Personally, I wouldn't be thrilled to see the logic in the scripts
> get more complex; I would rather go in the opposite direction. IMO, scripts
> should be trimmed to set env vars (with no voodoo logic), build the classpath
> (with no voodoo logic, just from a set of dirs) and call Java.
See the first item above.  The point is to enable cross-platform scripting
of the things we already have to script.  IMO, scripts should get out of
the env var business entirely, but that's unrelated to this question :-)
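
For illustration only, here is a minimal sketch of the "build the classpath from a set of dirs and call Java" step written once in Python so it runs unchanged on Linux and Windows. It is not taken from any existing Hadoop script; the function names build_classpath and java_command, and the directory and class names in the usage note, are hypothetical.

```python
import os


def build_classpath(dirs):
    """Join every .jar found directly inside the given directories.

    os.pathsep is ':' on Linux and ';' on Windows, so the same code
    produces a valid classpath string on both platforms.
    """
    jars = []
    for d in dirs:
        if not os.path.isdir(d):
            continue  # skip directories that do not exist
        for name in sorted(os.listdir(d)):
            if name.endswith(".jar"):
                jars.append(os.path.join(d, name))
    return os.pathsep.join(jars)


def java_command(main_class, dirs, args=()):
    """Return the argument list that would start the JVM.

    Uses JAVA_HOME if it is set, otherwise relies on 'java' being on PATH.
    """
    java_home = os.environ.get("JAVA_HOME")
    java = os.path.join(java_home, "bin", "java") if java_home else "java"
    return [java, "-cp", build_classpath(dirs), main_class, *args]
```

A launcher could then pass the result to subprocess.call, for example subprocess.call(java_command("org.example.Main", ["lib"])), where the class and directory are placeholders. The point of the sketch is that the path-separator and path-joining differences are handled by the standard library rather than by two parallel scripts.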

> Finally, this is a code change, so I'm not sure why we are doing a vote.
I view this as a tools issue that affects questions that go beyond the
one-time choice of how to write (or re-write) saveVersion.sh.  Also Aaron
(atm) recommended that I bring it to the list.  So here we are :-)

Cheers,
--Matt

On Thu, Nov 29, 2012 at 5:25 PM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:

> Matt,
>
> Let me repost my previous questions and a few more. I'd appreciate your
> answers, as it will help me understand the full impact this would have in
> Hadoop and related projects.
>
> * Python as a runtime requirement. Are you planning to migrate all BASH
> scripts provided by Hadoop (or dynamically created, i.e. launcher scripts)
> to Python?
> * What else in the current build, besides saveVersion.sh, do you see as
> a candidate to be migrated to Python?
> * How are you planning to define what Python modules can be used? Will
> developers have to install them manually?
> * What kind of tasks do you envision Python scripts will enable that are not
> possible today?
> * Will the requirement of Python be pushed to clients using the hadoop
> script? If so, this would affect all downstream projects that use the hadoop
> script in one way or the other, right?