Hadoop, mail # dev - [DISCUSS] Stop support for non-packaged i.e. developer build single-node installs
Re: [DISCUSS] Stop support for non-packaged i.e. developer build single-node installs
Eric Yang 2012-01-18, 23:29
Single-node developer builds can be supported independently from
package-installed builds.  There are multiple ways to achieve this.
The current method uses shell scripts to handle the different
deployment conditions.  This is an awkward and inefficient way to
support both the packaged build and the developer build via the
same script.  "mvn package" compiles jar files and native libraries,
then places the output in a deployment structure.  Most of the time
developers do not want the compiled jar files and native libraries to
be reorganized.  Skipping that step would cut compilation time by
30-50%, and let developers focus on making changes rather than
waiting for "mvn package".
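As a rough illustration of the difference (these are the standard Maven
lifecycle goals; the exact profiles and flags used by the Hadoop build
may differ):

```shell
# Compile in place: class files and native libraries stay under
# each module's target/ directory, with no re-staging.
mvn compile

# Full packaging: additionally assembles jars and the deployment
# directory layout, which is where the extra time goes.
mvn package -DskipTests
```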

I propose to enhance "mvn eclipse:eclipse" support so that it creates an
integrated runtime environment in which developers can compile and run
code directly within Eclipse, and to make this the default standard
environment for developers.  The shell scripts can then completely drop
support for running the development structure and focus on the packaged
runtime structure.  This will provide a cleaner separation, letting
developers and packagers each focus on their tasks at hand.
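For reference, the Eclipse project metadata is generated with the
standard maven-eclipse-plugin goal named above; the extra flag shown
here is optional and purely illustrative:

```shell
# Generate Eclipse .project/.classpath files for each module.
# -DdownloadSources=true additionally fetches dependency sources
# so they can be browsed inside the IDE.
mvn eclipse:eclipse -DdownloadSources=true
```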

The only issue with this approach is that developers who do not
develop with Eclipse would need to repeat "mvn package" like
packagers do.  I think this is a good trade-off, and it may improve
Hadoop's ability to stay OS-neutral, if the development environment
does not depend on shell scripts.

regards,
Eric

On Wed, Jan 18, 2012 at 12:35 PM, Ravi Prakash <[EMAIL PROTECTED]> wrote:
> I think the maven build system is also at issue. I don't know if it's
> intended, but if I change a single .java file, "mvn -Pdist -P-cbuild
> -Dmaven.javadoc.skip -DskipTests install" runs a TON of stuff.... It takes
> 1m 29s to finish even if there are NO changes anywhere. I doubt this is
> how it should be. Am I doing it right?
>
> In my mind, the gmake model (where a recipe is run only if the target is
> older than the prerequisites) is what maven should do for us. It's senseless
> to rebuild stuff if it hasn't changed. Probably some configuration needs
> fixing.
>
> I'm afraid though that if we fix it, I might end up being too productive
> and fix all of Hadoop.
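[The gmake model mentioned above can be sketched in a few lines of
plain shell: a recipe fires only when a prerequisite's timestamp is
newer than the target's. File names here are illustrative stand-ins,
not part of the Hadoop build.]

```shell
#!/bin/sh
# Timestamp rule as gmake applies it: rebuild only when the
# prerequisite is newer than the target.
src=$(mktemp)   # stands in for a .java prerequisite
out=$(mktemp)   # stands in for the built .class target

sleep 1
touch "$out"    # target is now newer than the prerequisite
if [ "$src" -nt "$out" ]; then echo rebuild; else echo up-to-date; fi

sleep 1
touch "$src"    # prerequisite "edited": newer than the target again
if [ "$src" -nt "$out" ]; then echo rebuild; else echo up-to-date; fi

rm -f "$src" "$out"
```

[Run as-is, the first check reports the target up to date and the
second reports a rebuild is needed, which is exactly the incremental
behavior Ravi is asking Maven for.]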
>
> On Wed, Jan 18, 2012 at 10:38 AM, Tom White <[EMAIL PROTECTED]> wrote:
>
>> Arun,
>>
>> There are instructions on how to run 0.23/trunk from the dev tree here:
>>
>>
>> http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment#Run_HDFS_in_pseudo-distributed_mode_from_the_dev_tree
>>
>> http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment#Run_MapReduce_in_pseudo-distributed_mode_from_the_dev_tree
>>
>> I think it's useful to be able to do this since the feedback cycle is
>> shorter between making a change and seeing it reflected in running
>> code. I don't think it's either one or the other: I use tarballs too
>> in other circumstances - it's useful to have both.
>>
>> To prevent the packages being broken we should have automated tests
>> that run on nightly builds:
>> https://issues.apache.org/jira/browse/HADOOP-7650.
>>
>> Cheers,
>> Tom
>>
>> On Wed, Jan 18, 2012 at 7:16 AM, Arun C Murthy <[EMAIL PROTECTED]>
>> wrote:
>> > Folks,
>> >
>> >  Somewhere between MR-279 and mavenization we have broken the support
>> for allowing _developers_ to run single-node installs from the non-packaged
>> 'build' i.e. the ability to run single-node clusters without the need to use a
>> tarball/rpm etc. (I fully suspect MR-279 is to blame as much as anyone else!
>> *smile*)
>> >
>> >  I propose we go ahead and stop support for this officially to prevent
>> confusion I already see among several folks (this has come up several times
>> on our lists in context of hadoop-0.23).
>> >
>> >  Some benefits I can think of:
>> >  a) Focus on fewer 'features' in the core.
>> >  b) Reduce maintenance/complexity in our scripts (bin/hadoop, bin/hdfs
>> etc.).
>> >  c) Force us devs to eat our own dogfood when it comes to packaging etc.
>> (I can think of numerous cases where devs have broken the tarball/rpm