Hadoop, mail # dev - [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack


Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Steve Loughran 2012-11-30, 13:29
On 30 November 2012 12:57, Luke Lu <[EMAIL PROTECTED]> wrote:

> I'd like to change my binding vote to -1, -0, -1.
>
> Considering the hadoop stack/ecosystem as a whole, I think the best cross
> platform scripting language to adopt is jruby for following reasons:
>
> 1. HBase already adopted jruby for HBase shell, which all current platform
> vendors support.
> 2. We can control the version of language implementation at a per release
> basis.
> 3. We don't have to introduce new dependencies in the de facto hadoop
> stack. (see 1).
>
>
I don't see why these arguments should have any bearing on using Python at
build time, since that doesn't introduce any dependencies downstream. Yes, you
need Python at build time, but that's no worse than needing protoc, gcc and
the automake toolchain.

> I'm all for improving multi-platform support. I think the best way to do
> this is to have a thin native script wrappers (using env vars) to call the
> cross-platform jruby scripts.
>
>
Were it not for the mess that the env-var configuration hierarchy is in
today, I'd agree. Where do you set your env vars? hadoop-env.sh? Where does
that come from? The hadoop conf dir? How do you find that? Via an env variable,
or via a ../../conf from bin/hadoop.sh, which breaks once you start symlinking
to hadoop/bin? Or do you assume a root installation in /etc/hadoop/conf,
which points to /etc/alternatives/hadoop-conf, which can then point back to
/etc/hadoop/conf.pseudo? And what about JAVA_HOME?
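
To make the ambiguity concrete, here is a minimal sketch of the lookup chain
described above. It is illustrative only and is not what bin/hadoop actually
does: the function name candidate_conf_dirs is hypothetical, HADOOP_CONF_DIR is
assumed as the env variable (the text above only says "an env variable"), and
the ordering of the three sources is a guess.

```python
import os


def candidate_conf_dirs(script_path, env=None):
    """List (source, path) guesses for the conf dir, in one possible order.

    Hypothetical helper: the order and the HADOOP_CONF_DIR name are assumptions.
    """
    env = os.environ if env is None else env
    candidates = []

    # 1. An explicit env variable, if one is set.
    if env.get("HADOOP_CONF_DIR"):
        candidates.append(("env", env["HADOOP_CONF_DIR"]))

    # 2. ../../conf relative to the launcher script as invoked.
    #    abspath() does not resolve symlinks, so a symlinked launcher
    #    yields a conf dir next to the symlink, not next to the real script.
    invoked = os.path.abspath(script_path)
    relative = os.path.normpath(os.path.join(invoked, "..", "..", "conf"))
    candidates.append(("relative", relative))

    # 3. A system-wide install location.
    candidates.append(("system", "/etc/hadoop/conf"))
    return candidates


if __name__ == "__main__":
    # Invoked directly from the install tree vs. through a symlink elsewhere.
    for launcher in ("/opt/hadoop/bin/hadoop.sh", "/usr/local/bin/hadoop.sh"):
        print(launcher, candidate_conf_dirs(launcher, env={}))
```

With an empty environment, the first launcher path gives /opt/hadoop/conf as
the relative guess, while the second (standing in for a symlink) gives
/usr/local/conf, which is the breakage mentioned above.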

Those env vars are something I'd like to see the back of.