Alex, did you run into funky issues with EC2/EMR? The kind of issues that would come up because it's a virtualized environment? We currently own our hardware and are just trying to do an ROI analysis on whether moving to Amazon can reduce our admin costs. Currently, administering a Hadoop cluster is a bit expensive (in terms of man-hours spent replacing disks and so on), and we are exploring whether it's possible to avoid some of those costs.
From: alex bohr <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: Dhaval Shah <[EMAIL PROTECTED]>
Sent: Monday, 12 August 2013 1:41 PM
Subject: Re: Hosting Hadoop
I've had good experience running a large Hadoop cluster on EC2 instances. After almost 1 year we haven't had any significant downtime, just lost a small # of data nodes.
I don't think EMR is an ideal solution if your cluster will be running 24/7.
But for running a large cluster, I don't see how it's more cost-efficient to run in the cloud than to own the hardware, and we're trying to move off the cloud onto our own hardware. Can I ask why you're looking to move to the cloud?
On Fri, Aug 9, 2013 at 10:42 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
check altiscale as well
>On Fri, Aug 9, 2013 at 3:05 AM, Dhaval Shah <[EMAIL PROTECTED]> wrote:
>Thanks for the list Marcos. I will go through the slides/links. I think that's helpful
>> From: Marcos Luis Ortiz Valmaseda <[EMAIL PROTECTED]>
>>To: Dhaval Shah <[EMAIL PROTECTED]>
>>Cc: [EMAIL PROTECTED]
>>Sent: Thursday, 8 August 2013 4:50 PM
>>Subject: Re: Hosting Hadoop
>>Well, it all depends, because many companies use cloud computing
>>platforms like Amazon EMR, VMware, and Rackspace Cloud for Hadoop.
>>There are a lot of companies using HBase hosted in the cloud. The last
>>HBaseCon was full of great use cases:
>>HBase at Groupon
>>A great talk by Benoit on network design for HBase:
>>Using Coprocessors to Index Columns in an Elasticsearch Cluster
>>2013/8/8, Dhaval Shah <[EMAIL PROTECTED]>:
>>> We are exploring the possibility of hosting Hadoop outside of our data
>>> centers. I am aware that Hadoop in general isn't exactly designed to run on
>>> virtual hardware. So a few questions:
>>> 1. Are there any providers out there who would host Hadoop on dedicated
>>> physical hardware?
>>> 2. Has anyone had success hosting Hadoop on virtualized hardware where 100%
>>> uptime and performance/stability are very important (we use HBase as a
>>> real-time database and it needs to be up all the time)?
>>Marcos Ortiz Valmaseda
>>Product Manager at PDVSA