|
Konstantin Boudnik
2011-02-26, 03:55
Nigel Daley
2011-02-26, 07:47
Konstantin Boudnik
2011-02-26, 19:14
Eric Yang
2011-02-26, 20:43
Konstantin Boudnik
2011-02-26, 22:18
Eric Yang
2011-02-27, 00:03
Konstantin Boudnik
2011-02-27, 00:34
Eric Yang
2011-02-27, 01:38
Konstantin Boudnik
2011-02-27, 03:10
Eric Yang
2011-02-27, 09:32
Konstantin Boudnik
2011-02-27, 19:36
Konstantin Boudnik
2011-02-28, 20:21
Allen Wittenauer
2011-02-28, 20:50
Steve Loughran
2011-03-01, 11:24
|
-
Build/test infrastructure Konstantin Boudnik 2011-02-26, 03:55
Looking at recurring build/test-patch problems on hadoop? build machines I thought of a way to make them:
a) all the same (in configuration and installed software)
b) have an effortless system to run upgrades/updates on all of them in a controlled fashion.

I would suggest creating Puppet configs (the exact content to be defined) which will be checked into SCM (e.g. SVN). Whenever a build host's software needs to be restored/updated, a simple run of Puppet across the machines, or a change in config and a run of Puppet, will do the magic for us.

If there are no objections from the community I can put together some Puppet recipes which might be evolved as we go.

--
Take care,
Cos
2CAC 8312 4870 D885 8616 6115 220F 6980 1F27 E622

After all, it is only the mediocre who are always at their best.
Jean Giraudoux
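The proposal above amounts to keeping one desired host configuration under version control and comparing every build host against it. A minimal Python sketch of that comparison follows; it is not how Puppet is implemented. The `DESIRED` mapping, the package names, and the version strings are all hypothetical placeholders, and a real Puppet manifest would express the same thing declaratively.

```python
from typing import Dict, List, Tuple

# Hypothetical desired state; in the proposal this would live in SCM
# alongside the Puppet configs. Names and versions are illustrative only.
DESIRED: Dict[str, str] = {"gcc": "4.4.3", "patch": "2.6", "ant": "1.8.1"}


def drift(desired: Dict[str, str],
          installed: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (package, wanted, found) for each package that is
    missing or installed at a version other than the desired one."""
    report = []
    for name in sorted(desired):
        found = installed.get(name, "absent")
        if found != desired[name]:
            report.append((name, desired[name], found))
    return report
```

Running `drift(DESIRED, host_packages)` on each build host and getting an empty list everywhere would correspond to goal (a); a non-empty list is the set of changes a configuration run would have to make for goal (b).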
-
Re: Build/test infrastructure Nigel Daley 2011-02-26, 07:47
+1.
Once HADOOP-7106 is committed, I'd like to propose we create a directory at the same level as common/hdfs/mapreduce to hold build (and deploy) type scripts and files. These would then get branched/tagged with the rest of the release.

Nige
-
Re: Build/test infrastructure Konstantin Boudnik 2011-02-26, 19:14
On Fri, Feb 25, 2011 at 23:47, Nigel Daley <[EMAIL PROTECTED]> wrote:
> +1.
>
> Once HADOOP-7106 is committed, I'd like to propose we create a directory at the same level of common/hdfs/mapreduce to hold build (and deploy) type scripts and files. These would then get branches/tagged with the rest of the release.

That makes sense, although I don't expect the host configurations to change very often.

Cos
-
Re: Build/test infrastructure Eric Yang 2011-02-26, 20:43
We should be very careful about the approach that we choose for build/packaging. The current state of hadoop is coupled together due to the lack of a standardized RPC format. Once this issue is cleared, the community will want to split hdfs and m/r into separate projects at some point. It may be better to ensure the project is modularized and works from the same svn repository. Maven is great for doing this, and most of the build and scripts can be defined in pom.xml. Deployment/test server configuration can be passed in from hudson. We should ensure that build and deployment scripts do not further couple the project.

Regards,
Eric
-
Re: Build/test infrastructure Konstantin Boudnik 2011-02-26, 22:18
This discussion is about neither the build of the product nor its packaging. We are discussing patch validation and snapshot build infrastructure.
-
Re: Build/test infrastructure Eric Yang 2011-02-27, 00:03
The proposed test automation process hasn't been thought through. Apache Hudson has been set up to trigger patch builds and to set up a pre-commit test environment. Unfortunately, the current setup needs refinement with a proper source code setup to make the builds work again. Ideally, the test cycle would have a commit build which runs simple unit tests, and a secondary build (every 24 hours) to run more thorough tests on a multiple machine setup. The test cluster should be cleansed after every secondary build, and ideally this is done in a sandbox approach.

However, I don't think bringing in a puppet environment setup makes the test system reproducible. Consequently, it may be better to have the cluster test setup as part of the scripts in the maven integration test phase. This will enable any hadoop developer to set up his own test cluster without setting up a puppet master. I am not fixated on build and packaging only, but am expressing my opinions on improving the build system and making the system easier to reproduce.

Regards,
Eric
-
Re: Build/test infrastructure Konstantin Boudnik 2011-02-27, 00:34
Apparently you are talking about something else, but I will bite...

On Sat, Feb 26, 2011 at 04:03PM, Eric Yang wrote:
> The proposed test automation process hasn't been thought through. Apache
> Hudson has been setup to trigger patch builds, and setup pre-commit test
> environment. Unfortunately, the current setup needs refinement with proper
> source code setup to make the builds working again. Ideally, the test cycle
> have a commit build which runs simple unit tests, and a secondary build
> (every 24 hours) to run more through tests on multiple machine setup. The
> test cluster should be cleansed after every secondary build, and ideally

We don't have a test cluster for Apache Hadoop validation. All I am focusing on is build and patch validation infrastructure.

> this is done in a sandbox approach. However, I don't think bring in puppet
> environment setup is making the test system reproducible. Consequently, it

If a specialized and highly scalable host configuration system such as Puppet doesn't guarantee configuration reproducibility then I am not sure what else will. Say, Y! uses the proprietary Igor environment exactly for these purposes, but of course it is highly coupled with the yinst format and can't be used anywhere else.

> may be better to have the cluster test setup as part of scripts in maven
> integration test phase. This will enable any hadoop developer to setup his

Doing deployment from a build system is certainly possible, but is suboptimal because it pollutes the build with HW/OS details, deployment scripts and such. Besides, last time I checked Hadoop was built by Ant.

> own test cluster without setup puppet master. I am not fixated on build and

You don't need to set up a puppet master in order to bounce a node. Puppet works in a client-only mode just as well.

Cos

> packaging only, but express my opinions on improving build system and making
> the system easier to reproduce.
>
> Regards,
> Eric
-
Re: Build/test infrastructure Eric Yang 2011-02-27, 01:38
On 2/26/11 4:34 PM, "Konstantin Boudnik" <[EMAIL PROTECTED]> wrote:
> We don't have a test cluster for Apache Hadoop validation. All I am focusing
> on is build and patch validation infrastructure.

If the plan is to use the puppet agent without a puppet master to configure the system locally for testing patch builds, it is probably the wrong tool for the job. The value of puppet is to be able to configure heterogeneous services across machines in a consistent manner. Is there a plan to deploy multiple services across machines? If the purpose is using puppet for config templates, ant or maven can do the job equally well.

> Doing deployment from a build system is certainly possible, but is suboptimal
> because it pollutes the build with HW/OS details, deployment scripts and such.
> Besides, last time I've checked Hadoop was built by Ant.

Deploying to a remote machine can be as simple as: scp the tarball, extract it, apply the template, and run it. None of this requires puppet. Instead of an ant + puppet combination, the patch test build structure could be simplified by using maven + shell scripts.

Regards,
Eric
-
Re: Build/test infrastructure Konstantin Boudnik 2011-02-27, 03:10
On Sat, Feb 26, 2011 at 05:38PM, Eric Yang wrote:
> If the plan is using puppet agent without puppet master for configuring the
> system locally to test patch builds. It is probably using the wrong tool
> for the job. The value of puppet is to be able to configure heterogeneous
> services across machines in a consistent manner. Is there plan to deploy

This is simply not the only value of the tool. It makes maintaining OS configurations and system package installations as easy for 1 host as for 1000 of them. Here's one of many examples:
http://hstack.org/hstack-automated-deployment-using-puppet/

BTW, Puppet and Chef recipes are very widely used by all sorts of Ops and cluster management companies. Perhaps Maven and shell are too - I'm not in a position to make a judgement call. I'll let Y! Grid Ops comment on it - they know everything about configuration management of sizable clusters and the tools for the job.

> multiple services across machines? If the purpose is using puppet for
> config templates, ant or maven can do the job equally well.

Are you suggesting that it is easier to install patch and gcc packages of version X.X.Z from a Maven build than from Puppet or Chef? If so - please cut such a patch for the community to review. That'd be a great contribution!

Furthermore, my Puppet knowledge is very limited and I am for sure no expert in maven. I have some concerns, however:
- how to provide privileged access
- how and where to store host configurations (i.e. package names and versions, which are going to be different for different OSes)
- how to do native packages (see the example above) and native dependency management from maven? With shell scripting?
- how to maintain such a construct?
I can continue for a long time, but I'd rather solve the issue of managing build host configurations/package sets in the most efficient and sustainable manner.

In a properly designed CI system the build shouldn't be responsible for configuring its operating environment. It might and should check that everything is in place (and crash/report accordingly). But if my Ant script goes around downloading, installing and, God forbid, compiling chunks of my OS, I will soon end up with the elegance of Python or some such.

> > Doing deployment from a build system is certainly possible, but is suboptimal
> > because it pollutes the build with HW/OS details, deployment scripts and such.
> > Besides, last time I've checked Hadoop was built by Ant.
>
> Deploy to remote machine can be as simple as scp tarball, extra, apply
> template, and run it. None of this requires puppet. Instead of ant +
> puppet combination, the patch test build structure could be simplified by
> using maven + shell scripts.

Sorry, but Maven + shell script can be called a simplification only in a pipe dream ;) Maven is a build tool. A relatively good one perhaps, but just a build tool. Certainly everything can be done with a combination of shell scripting + tarballs and a little SSH sugar topping. But I'd rather use an accurately designed and supported tool (Puppet, Chef, etc.).

And BTW - Hadoop builds aren't maven'ized yet, which renders most of the argument a waste of time until that problem is solved.

At any rate, HADOOP-7157 is the JIRA for this. Please comment on it.

Cos
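The point that a build "might and should check if everything is in place (and crash/report accordingly)" without configuring the host itself can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the check covers only whether a tool is on `PATH`, not its version.

```python
import shutil
import sys
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools that cannot be found on PATH.

    Nothing is installed or modified; the build only inspects the host.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


def preflight(tools: Iterable[str]) -> None:
    """Stop the build with a readable report if the host is incomplete."""
    missing = missing_tools(tools)
    if missing:
        sys.exit("build host is missing: " + ", ".join(missing))
```

A build wrapper could call `preflight(["patch", "gcc"])` before doing any work, leaving installation and upgrades to whatever host configuration tool is chosen.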
-
Re: Build/test infrastructure Eric Yang 2011-02-27, 09:32
On 2/26/11 7:10 PM, "Konstantin Boudnik" <[EMAIL PROTECTED]> wrote:
> Furthermore, my Puppet knowledge is very limited and I am for sure no expert
> in maven. I have some concern however:
> - how to provide privileged access
> - how and where to store host configurations (i.e. packages names, versions,
> which are gonna be different for difference OSes)
> - how to do native packages (see above example) and native dependency
> management from maven? With shell scripting?
> - how to maintain such a construct?
> I can continue for a long time, but I'd rather want to solve an issue of
> managing build host configurations/package sets in a most efficient and
> sustainable manner.

Hudson already supports a chroot jail environment. It is easy to set up privileged access in the jailed environment by giving the user that runs hudson sudo access to the jailed environment. The host configuration can be mirrored into the chroot environment with a minimal set of shell commands.

> Sorry, but Maven + shell script can be called simplification only in a pipe
> dream ;) Maven is a build tool. A relatively good one perhaps, but just a
> build tool. Certainly everything can be done with a combination of a shell
> scripting + tar balls and a little SSH sugar topping. But I'd rather use a
> accurately designed and supported tool (Puppet, Chef, etc.).

Maven supports various kinds of remote deployment plugins. The exec plugin with a shell script is the easiest one to implement. There are also plugins like cargo for more complex container deployment.

There is a plan to write a deployment framework for hadoop for large scale deployment. This project is in the planning stage. The scope is deploying the entire hadoop stack (hdfs, mr, zookeeper, hbase, pig, hive, and chukwa) to multiple large clusters. It is similar to what you are planning, except at a scale where it would make sense to use puppet+mcollective. We did an evaluation and found that a single puppet master would not scale well beyond 1800 nodes, and that a multilayer puppeteer spanning tree to cover all our nodes is not ideal. We chose to use chef-solo for edge deployment. The rest of the details are to be worked out.

This is the reason that I am interested in the test environment that is being planned here. It will be possible to use the "to be invented" framework in hudson. This system is not going to be grown in the ant/maven build scripts, hence it will be better to keep the build system simple for now.

Regards,
Eric
-
Re: Build/test infrastructure Konstantin Boudnik 2011-02-27, 19:36
And now jailed environments with Hudson - oh yes, lovely yinst all over again.
Eric, the glorious plans about a framework to deploy thousands of nodes at the click of a button are awesome indeed. But you are apparently ignoring the point: this is about synchronization of 10 build machines at the level of system packaging and operating system configuration.

With warmest regards,
Cos
-
Re: Build/test infrastructure Konstantin Boudnik 2011-02-28, 20:21
On Mon, Feb 28, 2011 at 12:03, Rajiv Chittajallu <[EMAIL PROTECTED]> wrote:
> --- On Sun, 2/27/11, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
>
>> And now jailed environments with
>> Hudson - oh yes, lovely yinst all over again.
>
> From where did the 'yinst' come in to this discussion? Since you know something about a company's
> internal tools, don't assume every one posting from that company is talking about it.

True, it doesn't imply that. It was my assumption based on the chrooting/jailing suggestion above. Of course, it doesn't mean yinst or anything related to it. Sorry if I have offended anyone by this reference.

Cos
-
Re: Build/test infrastructure Allen Wittenauer 2011-02-28, 20:50
On Feb 26, 2011, at 7:10 PM, Konstantin Boudnik wrote:
> BTW, Puppet and Chef recipes are very widely used by all sorts of Ops and
> cluster management companies. Perhaps, Maven and shell too - I'm not in a
> position to make a judgement call. I'll let Y! Grid Ops to comment on it -
> they know everything about sizable clusters configuration management and tools
> for the job.

I'm not in Y! Grid Ops, but from where I sit, it sounds like you are solving the wrong problem. Getting the build machines to be the same should mostly be a one-time issue. If non-ops-types are messing with the package load, then that's a privilege and process problem, not an automation problem.

The success of a configuration management utility is directly correlated to the amount of work that people are willing to put into using it.
-
Re: Build/test infrastructure Steve Loughran 2011-03-01, 11:24
On 27/02/11 01:38, Eric Yang wrote:
> Instead of ant +
> puppet combination, the patch test build structure could be simplified by
> using maven + shell scripts.

The trouble with any scripted approach vs state-driven CM tooling is its inability to cope with a starting point that doesn't match the script's assumptions. This isn't me picking on maven -different topic- more the CM problem. Lots of people do use shell scripts, and if your starting point is known -such as a golden VM image- it mostly works. Mostly.

Even with RPM installation (remember, it's scripts underneath), if you kickstart install and then update a thousand servers, they will end up executing their scripts in different orders, and so end up in different final states.

Of course, the weakness of CM tools is that they often end up being given goals that can't be consistently satisfied, so they cycle between the set of sub-optimal configurations.
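The contrast drawn above between a scripted approach and state-driven CM tooling can be illustrated with a toy Python model. This is a sketch under stated assumptions, not a description of any real tool: hosts are modelled as plain package-to-version dictionaries, the version numbers are hypothetical, and `converge` stands in for whatever install, upgrade, or downgrade actions a real CM tool would perform.

```python
from typing import Dict


def scripted_upgrade(host: Dict[str, str]) -> Dict[str, str]:
    """Imperative step: written against one assumed starting point
    (gcc 4.3 present) and unable to cope with any other."""
    if host.get("gcc") != "4.3":
        raise RuntimeError("unexpected starting point: %r" % host.get("gcc"))
    return {**host, "gcc": "4.4"}


def converge(host: Dict[str, str], desired: Dict[str, str]) -> Dict[str, str]:
    """State-driven step: declare the end state and reconcile towards it
    from whatever state the host happens to be in."""
    result = dict(host)
    for name, version in desired.items():
        if result.get(name) != version:
            result[name] = version  # placeholder for install/upgrade/downgrade
    return result
```

In this model `converge` is idempotent and reaches the same end state from an empty host, an outdated host, or an already correct one, whereas `scripted_upgrade` succeeds only from the golden starting point. The model deliberately ignores the weakness noted above, where the declared goals cannot all be satisfied at once.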