HBase >> mail # user >> Could an EC2 machine be 4 times slower than local dev workstation?

Re: Could an EC2 machine be 4 times slower than local dev workstation?
Yes, the performance hit is normal.
Looks like you're seeing network latency on disk I/O.
It could also be a tuning issue (differences in configuration).

I'm not sure how much the CPU difference will impact performance, but disk I/O will really kill you.
Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 30, 2011, at 11:33 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:

> Thank you, Bryan,
> that is very important and clears up some cloudiness in my mind.
> Sincerely,
> Mark
> On Fri, Dec 30, 2011 at 10:54 AM, Bryan Beaudreault <
>> We have also seen this in our testing, though we focused mainly on MR more
>> than HBase.
>> Keep in mind that EC2 Compute Units are defined as follows:
>>> The amount of CPU that is allocated to a particular instance is expressed
>>> in terms of these EC2 Compute Units. We use several benchmarks and tests to
>>> manage the consistency and predictability of the performance of an EC2
>>> Compute Unit. One EC2 Compute Unit provides the equivalent CPU capacity of
>>> a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
>> This does not even account for the CPU contention that Amandeep mentioned,
>> which we have noticed at times as well.  Also, c1.mediums have an I/O
>> Performance rating of "Moderate."  I think this mainly refers to Ethernet
>> speed, but it could refer to disk speed as well.
>> If your local workstation is a reasonably modern system, it is very
>> possible for you to see much better performance locally.  The difference
>> between the equivalent of 2.5 1.0 GHz 2007 processors (2.5 compute units)
>> and a modern i5, i7, or equivalent is huge, not just in speed and number of
>> cores, but also in architecture, cache, etc.  In terms of HBase write speed,
>> if you are running on an SSD locally this could cause a substantial gap as
>> well.
>> On Fri, Dec 30, 2011 at 12:38 AM, Amandeep Khurana <[EMAIL PROTECTED]> wrote:
>>> Is your client program running on the same node? Given that c1.mediums are
>>> on shared hosts, your neighbor might be overloading his VM, causing yours
>>> to starve.
>>> On Fri, Dec 30, 2011 at 9:50 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
>>>> Hi,
>>>> I am running a small program to load about 1 million rows into HBase. It
>>>> takes 200 seconds on my dev machine, and 800 seconds on a c1.medium EC2
>>>> machine. Both are running the same version of Ubuntu and the same version
>>>> of HBase. Everything is local on one machine in both cases.
>>>> What could the difference between the two environments be? I did notice
>>>> that my local machine has higher CPU loads:
>>>> hbase 64%
>>>> java (my app) 38%
>>>> hdfs 20%
>>>> whereas the EC2 machine shows:
>>>> hbase 47%
>>>> java (my app) 23%
>>>> hdfs 14%
>>>> Sincerely,
>>>> Mark
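
For anyone comparing two machines the same way, here is a rough Python sketch of the arithmetic and of one way to probe the disk side that comes up in the replies. It is not from the thread. The function names are made up for illustration, and the synced-write loop is only an assumed stand-in for "disk I/O": it does not model HBase's actual write path, so treat the result as a relative number to compare between hosts.

```python
import os
import tempfile
import time


def rows_per_second(rows, seconds):
    """Average load throughput in rows per second."""
    return rows / seconds


def fsync_write_latency(directory, writes=50, size=4096):
    """Mean seconds per synced write of `size` bytes to a temp file in `directory`.

    Hypothetical helper: a crude disk probe, not a model of HBase or HDFS.
    """
    payload = b"\0" * size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # flush to the device instead of leaving data in the page cache
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return elapsed / writes


if __name__ == "__main__":
    rows = 1_000_000  # "about 1 million rows" from the original post
    local = rows_per_second(rows, 200)  # dev machine: 200 seconds
    ec2 = rows_per_second(rows, 800)    # c1.medium: 800 seconds
    print(f"local: {local:.0f} rows/s, EC2: {ec2:.0f} rows/s, ratio: {local / ec2:.1f}x")

    latency = fsync_write_latency(tempfile.gettempdir())
    print(f"mean synced 4 KiB write: {latency * 1000:.2f} ms")
```

With the numbers from the original post the first line works out to 5000 rows/s locally versus 1250 rows/s on EC2, a 4.0x gap. Running the second measurement on both machines (pointing `directory` at the disk HBase actually writes to) gives a first hint of whether the gap tracks disk latency or something else, such as CPU or configuration.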