Automate Hadoop installation
praveenesh kumar 2011-12-05, 10:32
Hi all,
Can anyone guide me on how to automate the Hadoop installation/configuration process? I want to install Hadoop on 10-20 nodes, which may later grow to 50-100 nodes. I know we can use configuration tools like Puppet or shell scripts. Has anyone done it?
How can we install Hadoop on so many machines in parallel? What are the best practices for this?
Thanks, Praveenesh
RE: Automate Hadoop installation
Sagar Shukla 2011-12-05, 11:14
Hi Praveenesh, I created VM images with the OS and Hadoop nodes pre-configured, which I start as required. But if you plan to do this at the hardware level, Linux provides kickstart-style configuration, which installs the OS and packages automatically (network configuration is done through DHCP). This requires a TFTP server and a DHCP server, plus hardware that supports network boot.
Something like Puppet or shell scripts can also be set up as you mentioned; I have used them, but not for Hadoop.
Thanks, Sagar
Re: Automate Hadoop installation
Konstantin Boudnik 2011-12-05, 19:20
There's this great project called BigTop (in the Apache Incubator) which provides for building the Hadoop stack.
Part of what it provides is a set of Puppet recipes which will allow you to do exactly what you're looking for, perhaps with some minor corrections.
Seriously, look at Puppet; otherwise you will be living through a nightmare of configuration mismanagement.
Cos
Re: Automate Hadoop installation
warren 2011-12-22, 13:27
If you use Ubuntu, you can try ubuntu-orchestra-modules-hadoop.
RE: Automate Hadoop installation
Tom Wilcox 2011-12-22, 15:57
I'm interested in this too. You might find this article helpful: http://hstack.org/hstack-automated-deployment-using-puppet/
Apparently the Adobe SaaS guys are responsible for creating a project for Puppet-based Hadoop stack deployments...
Re: Automate Hadoop installation
Sam Ritchie 2011-12-31, 05:57
Hey guys, If you're okay with Clojure, I've put together a Hadoop deploy using Pallet. Here's a demo project with instructions: https://github.com/pallet/pallet-hadoop-example
I routinely boot and configure (from scratch) 50-node clusters on EC2 with no issues.
Cheers, Sam
-- Sam Ritchie, Twitter Inc 703.662.1337 @sritchie09
Re: Automate Hadoop installation
alo alt 2011-12-06, 07:30
Hi, to deploy software I suggest Pulp: https://fedorahosted.org/pulp/wiki/HowTo
For a package-based distro (Debian, Red Hat, CentOS) you can build Apache's Hadoop, package it and deploy it. Configs, as Cos says, go over Puppet. If you use Red Hat / CentOS, take a look at Spacewalk.
best, Alex
-- Alexander Lorenz http://mapredit.blogspot.com
Re: Automate Hadoop installation
Praveen Sripati 2011-12-06, 10:06
Also, check out Ambari (http://incubator.apache.org/ambari/), which is still in Incubator status. How do Ambari and Puppet compare?
Regards, Praveen
Re: Automate Hadoop installation
rishi pathak 2011-12-07, 08:14
Hello Praveen, It might be overkill, but you can tweak HOD for the purpose. It is already bundled with Hadoop. You will have to change the way nodes are allocated and commissioned, i.e. replace the Torque resource manager interface with ssh (or another remote shell).
The other way would be to use the post-install section of kickstart during node installation. This helps if the configuration files are mostly static and Hadoop is not upgraded frequently.
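The kickstart route can be sketched as a small Python helper that renders the post-install section for a node. This is only an illustration under stated assumptions: the package name `hadoop`, the URL `http://deploy.example.com/hadoop-conf.tar.gz` and the directory `/etc/hadoop/conf` are hypothetical placeholders, and it assumes the installed system has `yum`, `curl` and a repository that actually provides the package.

```python
def render_kickstart_post(
    package="hadoop",  # hypothetical package name
    conf_url="http://deploy.example.com/hadoop-conf.tar.gz",  # hypothetical URL
    conf_dir="/etc/hadoop/conf",  # hypothetical config directory
):
    """Return a kickstart %post section that installs Hadoop and unpacks static configs."""
    lines = [
        "%post --log=/root/ks-post.log",
        # Install the package from whatever repository the node is configured with.
        f"yum -y install {package}",
        # Fetch the (mostly static) configuration bundle and unpack it in place.
        f"mkdir -p {conf_dir}",
        f"curl -sf {conf_url} | tar -xz -C {conf_dir}",
        "%end",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the fragment so it can be appended to an existing kickstart file.
    print(render_kickstart_post(), end="")
```

Because the configuration is baked in at install time, any later change means re-running the install or pushing files by some other means, which is why this fits best when configs rarely change.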
-- Rishi Pathak, National PARAM Supercomputing Facility, C-DAC, Pune, India
Re: Automate Hadoop installation
Owen O'Malley 2011-12-07, 18:37
On Mon, Dec 5, 2011 at 2:32 AM, praveenesh kumar <[EMAIL PROTECTED]> wrote:
> Can anyone guide me how to automate the hadoop installation/configuration process?
We are rapidly making progress on Ambari. Ambari is an Apache project that will deploy, configure, and administer Hadoop clusters with all of the related tools (Hadoop, HBase, Pig, Hive, ZooKeeper, etc.). We will have CLI, REST, and web UI interfaces. Please come check out the project and join us if you are interested in helping build it: http://incubator.apache.org/ambari/
-- Owen
Re: Automate Hadoop installation
hailong.yang1115 2011-12-05, 13:16
Hi Praveenesh,
I have the same question. Recently we configured Hadoop on 36 nodes. Our practice was to edit all the configuration files, including core-site.xml, mapred-site.xml and hdfs-site.xml, on one node (the master node). Then we used a shell script to transfer the configuration files to all the other nodes.
We found that the most time-consuming part was not installing and configuring Hadoop, but setting up passwordless ssh for the hadoop user on each node, which involves some human interaction. So I would be delighted to learn that there is already a solution that automates the whole setup procedure from the very beginning.
Another concern of mine is how users can know whether the configuration is optimal for their Hadoop cluster. For example, how large should the io.sort.mb parameter be? I think the answer depends on the kind of workloads the users run and the hardware resources each node has. I would like to know whether there are guidelines or formulas that help users understand which parameters in the configuration files they need to tune to get optimal performance.
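The "edit on one node, then push to the rest" practice described above can be sketched in Python. This is a rough illustration, not a tested deployment tool: the `hadoop` user, the remote directory `/etc/hadoop/conf` and the host names are assumptions, the `io.sort.mb` value in the usage note is a placeholder rather than a tuning recommendation, and `scp` is assumed to work without a password prompt (that is, key-based ssh is already in place).

```python
import subprocess
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def render_site_xml(properties):
    """Render a dict of settings in the Hadoop *-site.xml property format."""
    root = ET.Element("configuration")
    for name, value in sorted(properties.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def scp_command(host, local_files, user="hadoop", remote_dir="/etc/hadoop/conf"):
    """Build (but do not run) the scp command that copies the files to one host."""
    return ["scp", "-q", *local_files, f"{user}@{host}:{remote_dir}/"]


def push_configs(hosts, local_files, runner=subprocess.run, max_workers=16):
    """Copy the files to every host in parallel and return {host: exit code}."""

    def push(host):
        return host, runner(scp_command(host, local_files)).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(push, hosts))
```

A possible use would be to write `render_site_xml({"io.sort.mb": 200})` to `mapred-site.xml` on the master and then call `push_configs(["node01", "node02"], ["mapred-site.xml"])`; any host with a non-zero exit code in the result needs attention. The `runner` parameter exists only so the fan-out can be exercised without touching real machines.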
Yours,
Hailong
--
Hailong Yang, PhD Candidate
Sino-German Joint Software Institute, School of Computer Science & Engineering, Beihang University
Phone: (86-010)82315908
Email: [EMAIL PROTECTED]
Address: G413, New Main Building, Beihang University, No. 37 XueYuan Road, HaiDian District, Beijing, P.R. China, 100191