Re: Why they recommend this (CPU) ?
Be sure you are comparing apples to apples.  The E5-2650 has a larger cache than the E5-2640, a faster system bus, and supports faster DRAM (1600 MHz vs. 1333 MHz), resulting in greater potential memory bandwidth.
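As a rough sanity check on the bandwidth point, the peak figures can be computed from the DRAM transfer rate (assuming quad-channel DDR3 per socket, as on the Sandy Bridge-EP parts being discussed -- check your actual board configuration):

```python
# Back-of-envelope peak memory bandwidth per socket.
# Assumes quad-channel DDR3 (E5-2600 series); illustrative, not measured.
BUS_WIDTH_BYTES = 8   # one 64-bit DDR3 channel
CHANNELS = 4          # memory channels per E5-2600 socket

def peak_bandwidth_gbs(transfer_rate_mts):
    """Peak bandwidth in GB/s for a given DDR3 transfer rate (MT/s)."""
    return transfer_rate_mts * BUS_WIDTH_BYTES * CHANNELS / 1000.0

print(peak_bandwidth_gbs(1600))  # E5-2650-class memory: 51.2 GB/s
print(peak_bandwidth_gbs(1333))  # E5-2640-class memory: ~42.7 GB/s
```

That is roughly a 20% difference in peak bandwidth, which matters for shuffle-heavy or scan-heavy jobs.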

From: Patrick Angeles <[EMAIL PROTECTED]>
Date: Thursday, October 11, 2012 12:36 PM
Subject: Re: Why they recommend this (CPU) ?

If you look at comparable Intel parts:

Intel E5-2640
6 cores @ 2.5 GHz
95W - $885

Intel E5-2650
8 cores @ 2.0 GHz
95W - $1107

So, for roughly $400 more on a dual-processor system -- which really isn't much -- you get 2 more cores per socket at a 20% lower clock. I can believe that for some scenarios, the faster cores would fare better. Gzip compression is one that comes to mind, where you are aggressively trading CPU for lower storage volume and IO. An HBase cluster is another example.
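Using "core-GHz" as a crude proxy for aggregate throughput (it ignores cache, memory bandwidth, and per-task overheads entirely), the list prices above work out like this:

```python
# Crude cost-per-throughput comparison of the two parts quoted above.
parts = {
    "E5-2640": {"cores": 6, "ghz": 2.5, "price": 885},
    "E5-2650": {"cores": 8, "ghz": 2.0, "price": 1107},
}

def core_ghz(p):
    """Aggregate 'core-GHz' -- a very rough throughput proxy."""
    return p["cores"] * p["ghz"]

for name, p in parts.items():
    print(f"{name}: {core_ghz(p):.1f} core-GHz, "
          f"${p['price'] / core_ghz(p):.0f} per core-GHz")
```

So the E5-2650 buys slightly more aggregate capacity (16.0 vs. 15.0 core-GHz) at a noticeably higher price per unit, which is why per-core speed vs. core count comes down to the workload.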

On Thu, Oct 11, 2012 at 3:03 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
My own clusters are too temporary and virtual for me to notice. I haven't thought of clock speed as mattering in a long time, so I'm curious what kinds of use cases might benefit from faster cores. Is there a category of workload where this sweet spot for faster cores occurs?

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 11:39 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:

You should measure your workload.  Your experience will vary dramatically with different computations.

On Thu, Oct 11, 2012 at 10:56 AM, Russell Jurney <[EMAIL PROTECTED]> wrote:
Anyone got data on this? This is interesting, and somewhat counter-intuitive.

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 10:47 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:

> Presumably, if you have a reasonable number of cores, speeding the cores up will beat forking a task into smaller and smaller chunks, because at some point the overhead of multiple processes becomes a bottleneck - perhaps due to streaming reads and writes. I'm sure each and every problem has a different sweet spot.
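That intuition can be sketched with an Amdahl-style toy model: if some fraction of a job is effectively serial (coordination, merges, I/O waits), a faster clock speeds up that part while extra cores do not. The numbers below are illustrative assumptions, not measurements:

```python
# Toy Amdahl-style model: fewer-but-faster cores vs. more-but-slower cores.
# `work` is in GHz-seconds of CPU; `serial_frac` is the non-parallel fraction.
def job_time(work, cores, ghz, serial_frac):
    """Job wall time when only (1 - serial_frac) of the work parallelizes."""
    return (serial_frac + (1 - serial_frac) / cores) * work / ghz

W = 100.0
for f in (0.0, 0.1):
    t_fast = job_time(W, cores=6, ghz=2.5, serial_frac=f)  # fewer, faster
    t_slow = job_time(W, cores=8, ghz=2.0, serial_frac=f)  # more, slower
    print(f"serial={f:.0%}: 6x2.5GHz -> {t_fast:.2f}s, 8x2.0GHz -> {t_slow:.2f}s")
```

With a perfectly parallel job (serial fraction 0%) the 8-core part wins; with even a 10% serial fraction the 6 faster cores pull ahead -- one concrete way the "sweet spot" can flip.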