Hadoop, mail # dev - [PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack


Re: [PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Matt Foley 2012-11-21, 21:14
Cos,
Please see in-line.

On Wed, Nov 21, 2012 at 12:00 PM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:

> I like Alejandro's idea about Maven for a few reasons:
>   - bringing in a scripting environment which is known for its inter-version
>     idiosyncrasies just because Windows can't handle trivial shell scripting
>     looks like overkill to me
>

Excuse me?  Can we at least try not to belittle other people's platforms on
a public Apache forum?  There's nothing trivial about implementing shell on
Windows, as cygwin regrettably proved.

>   - relative to above, there's a chance that Python's pre-requisites used in
>     Hadoop might get into a conflict with some other components in the stack.
>     This will be a nightmare for the integrator projects i.e. Bigtop
>

Said Bigtop project actually uses python, does it not?

>   - Maven is the de-facto standard for Java stacks
>

Sure -- except for when Ant was the de-facto standard for Java stacks.  And
let's remember what Maven and Ant are/were the de-facto standard for: doing
builds.  Not scripting everything that needs scripting.

>   - Maven has a built-in scripting language (Groovy) if some plugins aren't
>     sufficient for achieving whatever goals
>

Are you proposing Groovy as a better scripting language than Python?
>
> Addressing Matt's later point about non-Mavenized Hadoop-1 line: it uses
> Maven stuff such as deploy/install via custom ant tasks. Same approach would
> work for saveVersion.sh and others, I am sure.
>

Current ant scripts in Hadoop seem to use maven only for artifact
management via the maven repository.  If I'm missing something, please
point it out.  The ant build task currently calls out to saveVersion.sh.
Having it call out to maven, which then calls out to a plug-in and/or a
Groovy script, doesn't sound like an improvement to me.  And it's a way
different use of maven than currently in the Hadoop-1 line, not a
continuation of established practice.
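
For illustration only, here is a minimal sketch of what a cross-platform build-time version script could look like in Python. It is not the script proposed in HADOOP-8924 and not a port of saveVersion.sh: the function names, the key=value output format, and the assumption of a git checkout are all hypothetical.

```python
import datetime
import getpass
import subprocess


def run_or_unknown(cmd):
    """Run a command and return its trimmed stdout, or "Unknown" on any failure."""
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
        return out.decode("utf-8", "replace").strip() or "Unknown"
    except (OSError, subprocess.CalledProcessError):
        # Tool not installed, or not run inside a source checkout.
        return "Unknown"


def current_user():
    """Return the login name of the building user, or "Unknown"."""
    try:
        return getpass.getuser()
    except Exception:
        return "Unknown"


def collect_build_info(version):
    """Gather build metadata using only portable Python calls.

    Assumption: the source tree is a git checkout. A real script would
    also need to handle other version control systems.
    """
    return {
        "version": version,
        "revision": run_or_unknown(["git", "rev-parse", "HEAD"]),
        "branch": run_or_unknown(["git", "rev-parse", "--abbrev-ref", "HEAD"]),
        "user": current_user(),
        "date": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    }


def render_build_info(info):
    """Render the metadata as sorted key=value lines (hypothetical format)."""
    return "".join("%s=%s\n" % (key, info[key]) for key in sorted(info))
```

The same file would run unchanged on Linux and Windows wherever Python and git are on the PATH, which is the property under discussion; whether that outweighs adding Python as a dependency is the open question in this thread.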

--Matt
>
> Cos
>
> On Wed, Nov 21, 2012 at 11:25AM, Alejandro Abdelnur wrote:
> > Hey Matt,
> >
> > We already require java/mvn/protoc/cmake/forrest (forrest is hopefully on
> > its way out with the move of docs to APT)
> >
> > Why not do a maven-plugin to do that?
> >
> > Colin already has something to simplify all the cmake calls from the
> > builds using a maven-plugin
> > (https://issues.apache.org/jira/browse/HADOOP-8887)
> >
> > We could do the same with protoc, thus simplifying the POMs.
> >
> > The saveVersion.sh seems like another prime candidate for a maven plugin,
> > and in this case it would not require external tools.
> >
> > Does this make sense?
> >
> > Thx
> >
> > On Wed, Nov 21, 2012 at 11:15 AM, Matt Foley <[EMAIL PROTECTED]> wrote:
> >
> > > This discussion started in HADOOP-8924
> > > <https://issues.apache.org/jira/browse/HADOOP-8924>, where it was
> > > proposed to replace the build-time utility "saveVersion.sh" with a
> > > python script.  This would require Python as a build-time dependency.
> > > Here's the background:
> > >
> > > Those of us involved in the branch-1-win port of Hadoop to Windows
> > > without use of Cygwin have faced the issue of frequent use of shell
> > > scripts throughout the system, both at build time (e.g., the utility
> > > "saveVersion.sh") and at run time (config files like "hadoop-env.sh"
> > > and the start/stop scripts in "bin/*").  Similar usages exist
> > > throughout the Hadoop stack, in all projects.
> > >
> > > The vast majority of these shell scripts do not do anything platform
> > > specific; they can be expressed in a posix-conforming way.  Therefore,
> > > it seems to us that it makes sense to start using a cross-platform
> > > scripting language, such as python, in place of shell for these
> > > purposes.  For those rare occasions where platform-specific
> > > functionality really is needed, python also supports quite a lot of
> > > platform-specific functionality on both Linux and Windows; but where
> > > that is inadequate, one could still