Bigtop >> mail # dev >> Python Fork

Re: Python Fork
On 27/03/13 15:37, Roman Shaposhnik wrote:
> Hi Philip,
> first of all thanks a bunch for considering to lead with the code -- that's
> the best way to make progress. I have just started to look at your code
> and haven't finished reviewing it yet (in fact, I'd rather have somebody
> on this list who appreciates Python doing that -- Steve?) but let me
> quickly answer some of the points you've raised.
> My only metapoint here would be that since we're stuck with Java
> because 90% of Bigtop components are implemented in it
> and we've also chosen it as a platform for iTest -- any solution
> that leverages Java tooling (Gradle, Groovy, etc) would get
> more of my personal enthusiastic embrace than a solution
> that introduces yet another platform to care about (Python,
> Ruby, etc.).

I understand your point here, but I think Python suits this project much
better than a JVM-based language: to me, anything that depends on the
JVM isn't really scripting. Python avoids this because it's much simpler
and is installed by default even in minimal Linux environments, and it
suits this work very well.

At the moment we care about:

Make/Bash/Java/Puppet/Ruby/Kickstart, from all I can see at the moment.
With Python we can unify all of Bigtop's functionality properly.

Alongside Puppet, I intend to start a fork of vmbuilder to unify all
these projects for building virtual machine images, since vmbuilder is
basically Ubuntu-only right now, with CentOS support floating about on
GitHub. BoxGrinder is kind of gross to set up because of Ruby.

> On Wed, Mar 27, 2013 at 6:54 AM, Philip Herron
> <[EMAIL PROTECTED]> wrote:
>> 1 - Bigtop's logic is handled within gnu/make and bash which is
>> detrimental for the long term maintainability of bigtop as it may turn
>> away developers.
>> 2 - Make is hard to debug in this way esp for the need of \ for line run-on.
> Could you, please, elaborate on what exact problem you were faced
> with when you decided that Python is a better option than Make?
> We all have preferences (for example mine is, honestly, make over
> python) but of course if the current tooling is inadequate in some
> respect it needs to be updated.

Yeah, I understand you like make. I don't mind it, but it doesn't have
any real long-term potential, because I tend to think most people from
Java backgrounds will have next to no experience with make. I come from
a GCC developer/committer background, so I am used to it. But even then,
this isn't a valid use of make in my opinion, and I am sure the GNU make
developers would agree this isn't how make should be used.

Take, for example, the case where the next latest and greatest component
wants to ship as a *.{zip,tar.xz,tar.bz2,xz,7z}. This is a basic example,
but it's still valid: we need dynamic support for different archive
types. In Python we could figure this out. In Bash we're going to need
more hooks and more obscure Bash to do something like that.
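
To make the archive point concrete, here is a minimal sketch of
suffix-based dispatch using only the standard library. It assumes a
current Python 3; the names HANDLERS, archive_kind and extract are
hypothetical and are not taken from the fork. Bare .xz and .7z are
deliberately left unsupported here, since the standard library has no
7z reader.

```python
import tarfile
import zipfile

# Hypothetical suffix table (an assumption for illustration only).
# Each entry maps a filename suffix to the extractor family to use.
HANDLERS = [
    (".tar.bz2", "tar"),
    (".tar.xz", "tar"),
    (".tar.gz", "tar"),
    (".zip", "zip"),
]


def archive_kind(filename):
    """Return the handler name for a filename, or None if unsupported."""
    lowered = filename.lower()
    for suffix, kind in HANDLERS:
        if lowered.endswith(suffix):
            return kind
    return None


def extract(path, dest):
    """Unpack path into dest, picking the extractor from the file suffix."""
    kind = archive_kind(path)
    if kind == "zip":
        with zipfile.ZipFile(path) as archive:
            archive.extractall(dest)
    elif kind == "tar":
        # Mode "r:*" lets tarfile detect gzip, bzip2 or xz compression.
        with tarfile.open(path, "r:*") as archive:
            archive.extractall(dest)
    else:
        raise ValueError("unsupported archive type: %s" % path)
```

Supporting a new archive type then means adding one table entry and, if
needed, one branch, rather than another hook in the build scripts.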

>> 3 - This new directory structure feels more maintainable already with
>> bigtop's main function to generate packages for hadoop stack's on
>> different platforms. Other functions which (don't seem) to be actively
>> maintained like bigtop-deploy and the test framework with regards to
>> their documentation.
> Here's my very strong take on it -- without deployment and test Bigtop
> might as well pack its stuff and go home. If we're reduced to a "framework
> that builds packages" there's very little left in the project to keep me
> interested in it.

Deployment and test can still be done, and they are 100% important. I
think if I come back with a more complete Bigtop that includes these
components, my fork might be taken more seriously.

> We *all* should be investing pretty heavily in both. In fact, I hoped
> to make Bigtop 0.6.0 about polishing our tests and deployment
> code, but Hadoop 2.0.3 story changed that. Hopefully Bigtop 0.7.0
> will be a release where we really do invest a lot of energy in
> making both squeaky clean.

I understand this; it will be nice to see them.

Yes, it's key/value pairs, but it's horrible to read, and it's coupled
with a $eval at the end of each component. I dunno, this is a glorified
config file; I tend to be very pragmatic about this, and I think this
isn't what it should look like.

I understand your worry there. This code is pure Python with no extra
dependencies for now, so that's good, and if I get rid of my use of the
with statement I can make it work on Python >= 2.4, which would
eliminate this worry entirely.
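
As a rough illustration of what dropping the with statement involves:
with only became available in Python 2.5 (via a __future__ import) and
by default in 2.6, so a 2.4-compatible version falls back to
try/finally. The function names below are hypothetical and the file
read is just a stand-in for whatever the real code does.

```python
def read_text_with(path):
    """Version using the with statement (not available on Python 2.4)."""
    with open(path) as handle:
        return handle.read()


def read_text_portable(path):
    """Equivalent version using try/finally, which Python 2.4 accepts."""
    handle = open(path)
    try:
        return handle.read()
    finally:
        # Runs whether read() succeeds or raises, like leaving a with block.
        handle.close()
```

Both functions close the file on every exit path; the second is just
more verbose.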

Java is far worse for this problem. For example, how do you get a
standard Java version on each Linux distro?

I will come back with a more solid implementation with deployment and
