Bigtop >> mail # dev >> Python Fork


Re: Python Fork
Overall I think I like this - thanks for doing all the work for this fork!
It would indeed be nice to have a more sophisticated language for this.
Some thoughts:

   - If we do make a move to Python, we ought to make sure it works on
   Python 2.4 initially, as we're still doing releases that we intend to
   support on RHEL 5. I haven't tried the fork on RHEL 5 yet, so I'm just
   mentioning it - don't know if there actually are any small issues.
   - do-component-build and install_.sh: I think those should remain as
   bash scripts. Did you have any intention to the contrary?
   - Some of the do-component-build scripts source 'bigtop.bom' to get
   component versions in the environment. We need to make sure we still
   generate that in a shell-friendly form.
   - Don't know how I feel about "demoting" the tests. I've been under the
   impression that they were still maintained (though I admit I haven't paid
   much attention to them myself), but if they're not well maintained, I think
   I'd rather we make a plan to give them more attention than accept their
   unmaintained-ness.
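
For the shell-friendly bigtop.bom point, something roughly like this could work (a hypothetical helper, not code from the fork — the section/key layout is taken from the BOM example in the quoted mail; the fork targets Python 2.4, so `with` would need replacing there):

```python
# Sketch: emit a shell-sourceable bigtop.bom from the INI-style BOM,
# so do-component-build scripts can keep sourcing component versions.
try:
    from configparser import ConfigParser  # Python 3
except ImportError:
    from ConfigParser import ConfigParser  # Python 2

def write_shell_bom(ini_path, out_path):
    parser = ConfigParser()
    parser.read(ini_path)
    with open(out_path, 'w') as out:
        # [BIGTOP] BOM lists the components in the stack
        for component in parser.get('BIGTOP', 'BOM').split():
            # items() applies %(...)s interpolation and includes [DEFAULT]
            for key, value in parser.items(component):
                # e.g. HADOOP_BASE_VERSION="2.0.3-alpha"
                out.write('%s_%s="%s"\n' % (component.upper(), key.upper(), value))
```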

Definitely a big decision to make and there's a lot of discussion and work
to go - but put me down for a +1.

On Wed, Mar 27, 2013 at 6:54 AM, Philip Herron <[EMAIL PROTECTED]>
wrote:
>
> Hey all
>
> I am new to bigtop and would like to get feedback from the community on
> this initial concept.
>
> I have created a fork of bigtop in my own time at:
>
> http://git.buildy.org/?p=bigtop.git;a=shortlog;h=refs/heads/python-fork
>
>   $ git clone git://buildy.org/bigtop.git
>   $ git checkout --track -b python-fork origin/python-fork
>
> This fork moves bigtop's top-level Make/Bash logic into Python, which has
> a number of benefits for the whole bigtop community in its most
> important function, namely packaging:
>
> 1 - Bigtop's logic is currently handled within gnu/make and bash, which
> is detrimental to the long-term maintainability of bigtop, as it may
> turn away developers.
>
> 2 - Make is hard to debug in this style, especially given the need for \
> line continuations.
>
> 3 - The new directory structure already feels more maintainable, given
> that bigtop's main function is to generate packages for hadoop stacks on
> different platforms. Other functions, like bigtop-deploy and the test
> framework, don't seem to be actively maintained, at least with regard to
> their documentation.
>
> 4 - Using python we can actually take advantage of the system, for
> example using threads to build different components at the same time.
> Yes, I know we could probably hack together make -j N, but that would
> make things even harder to maintain.
>
> 5 - The bigtop.mk BOM file was difficult to read and debug, with its use
> of eval and ?=, the archives being set in the Makefile, and package.mk
> doing a lot of tricky bash to get things going.
>
> Using Python's ConfigParser we can have a single BOM file for all the
> dynamic data we care about, instead of spreading it over several makefiles.
>
> ### bigtop.BOM ###
> [DEFAULT]
> APACHE_MIRROR = http://apache.osuosl.org
> APACHE_ARCHIVE = http://archive.apache.org/dist
> ARCHIVES = %(APACHE_MIRROR)s %(APACHE_ARCHIVE)s
>
> [BIGTOP]
> BOM = hadoop
>
> [hadoop]
> NAME = hadoop
> BASE_VERSION = 2.0.3-alpha
> PKG_VERSION = 2.0.3
> RELEASE = 1
> SRC_TYPE = TARGZ
> SRC = %(NAME)s-%(BASE_VERSION)s-src.tar.gz
> DST = %(NAME)s-%(BASE_VERSION)s.tar.gz
> LOC = %(ARCHIVES)s
> DOWNLOAD_PATH = /hadoop/core/%(NAME)s-%(BASE_VERSION)s
> ### EOF ###
>
> I still need to spend time documenting this, but I think it looks very
> similar to the old BOM, only cleaner. We have a DEFAULT section to specify
> our archives, then a BIGTOP section whose BOM key lists all the components
> we want in the stack.
>
> The Hadoop component is the only one I have tested, but I don't see any
> problems adding others. The LOC field in the hadoop component is
> interesting: if you have several archives specified, it will look in each
> in turn for the tarball we want, so if the first is down it will go and
> look in the next one.
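
For what it's worth, the LOC/ARCHIVES fallback described at the end of the mail could be sketched like this (an illustration only, not code from the fork; `fetch` is a stand-in for whatever download helper is actually used):

```python
# Sketch: try each archive base URL in order, returning the first
# successful download. fetch() is injected so the fallback policy can
# be exercised without network access.
def fetch_from_mirrors(archives, download_path, dst, fetch):
    errors = []
    for base in archives.split():
        url = base + download_path + '/' + dst
        try:
            return fetch(url)
        except IOError as err:  # mirror down or tarball missing
            errors.append((url, str(err)))
    raise IOError('all mirrors failed: %r' % errors)
```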