Hadoop >> mail # dev >> [PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Re: [PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
/Ignore Python 3 for the time being, it's a completely different
language with incompatible syntax and semantics that doesn't support
several currently-important platforms. Maybe in a few years sane people
can consider moving to it, but for now it's best to just stick with the
compatible subset of Python 2.x. [1] the Mercurial project has had a
pretty good experience with this scheme;
http://mercurial.selenic.com/wiki/SupportedPythonVersions they currently
support 2.4 - 2.7 with a few required libraries. They dropped 2.2 and
2.3 support a few years ago due to specific shortcomings on those versions./

I know that Python compatibility can be worked around. I used Python for
a few years and wrote about 70k LOC in it, until it started to irritate me
that every new version has incompatibilities (2.3 vs 2.4 vs 2.5), which
makes maintaining and testing way harder than it should be. It's not just
a matter of missing library functions; sometimes even an expression
evaluates to a different value under a new version. This was similar to
the PHP 4 to PHP 5 migration. Today I have 3 versions of Python installed
because of software requirements.

For simple scripts it can probably work if you stick to some common subset.
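
As a rough illustration of what "common subset" could mean in practice, here is a minimal sketch. The helper names (floor_div, sorted_copy, describe_runtime) are made up for this example and do not come from any Hadoop script; the idea is only to avoid constructs whose availability or meaning differs between versions.

```python
import sys


def floor_div(a, b):
    """Integer division via `//`, available since Python 2.2.

    Plain `/` on two integers means different things depending on the
    version and on `from __future__ import division`, so it is avoided.
    """
    return a // b


def sorted_copy(items):
    """Return a sorted list without the sorted() built-in (added in 2.4)."""
    result = list(items)
    result.sort()
    return result


def describe_runtime():
    """Report the interpreter version as 'major.minor'.

    Index access is used because the named attributes of
    sys.version_info (.major, .minor) only appeared in 2.7.
    """
    return "%d.%d" % (sys.version_info[0], sys.version_info[1])
```

This only covers syntax and built-ins; it does nothing for the harder case of a standard library function changing behaviour between releases, which still needs testing on every supported version.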

Scripting via a Maven plugin has the advantage that users do not need to
install anything. There are a couple of languages available: Scala, Groovy,
Jelly, JRuby. Maybe Jython too.