MapReduce user mailing list: Why they recommend this (CPU)?


Thread:
Patai Sangbutsarakum    2012-10-11, 16:22
Jay Vyas                2012-10-11, 17:46
Russell Jurney          2012-10-11, 17:56
Ted Dunning             2012-10-11, 18:38
Russell Jurney          2012-10-11, 19:03
Patrick Angeles         2012-10-11, 19:36
Re: Why they recommend this (CPU)?
Be sure you are comparing apples to apples. The E5-2650 has a larger cache than the E5-2640, a faster system bus, and support for faster DRAM (1600 MHz vs. 1333 MHz), resulting in greater potential memory bandwidth.

http://ark.intel.com/compare/64590,64591
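For a rough sense of what that memory-speed difference could mean, here is a back-of-envelope sketch (mine, not from the thread), assuming quad-channel DDR3 and 8 bytes per transfer:

```python
# Back-of-envelope peak memory bandwidth per socket.
# Assumes quad-channel DDR3 (4 channels per socket) and 8 bytes per transfer.
def peak_bandwidth_gb_s(mt_per_s, channels=4, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth per socket in GB/s."""
    return mt_per_s * 1e6 * bytes_per_transfer * channels / 1e9

for speed in (1333, 1600):
    print(f"DDR3-{speed}: {peak_bandwidth_gb_s(speed):.1f} GB/s per socket")
# DDR3-1333: 42.7 GB/s per socket
# DDR3-1600: 51.2 GB/s per socket
```

That works out to roughly a 20% difference in theoretical peak bandwidth per socket, before any real-world effects.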
From: Patrick Angeles <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Date: Thursday, October 11, 2012 12:36 PM
To: [EMAIL PROTECTED]
Subject: Re: Why they recommend this (CPU)?

If you look at comparable Intel parts:

Intel E5-2640
6 cores @ 2.5 GHz
95W - $885

Intel E5-2650
8 cores @ 2.0 GHz
95W - $1107

So, for about $400 more on a dual-processor system (which really isn't much), you get 2 more cores per socket at a 20% lower clock speed. I can believe that for some scenarios the faster cores would fare better. Gzip compression is one that comes to mind, where you are aggressively trading CPU for lower storage volume and I/O. An HBase cluster is another example.
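To make that trade-off concrete, a quick sketch (mine, using the list prices and clocks quoted above, and treating cores x clock as a crude throughput proxy):

```python
# Rough comparison of the two parts on a dual-socket node.
# Prices and clocks are the ones quoted in the message above;
# "GHz-cores" (cores * clock) is only a crude proxy for aggregate throughput.
parts = {
    "E5-2640": {"cores": 6, "ghz": 2.5, "price": 885},
    "E5-2650": {"cores": 8, "ghz": 2.0, "price": 1107},
}

sockets = 2
for name, p in parts.items():
    ghz_cores = sockets * p["cores"] * p["ghz"]
    cost = sockets * p["price"]
    print(f"{name}: {sockets * p['cores']} cores, "
          f"{ghz_cores:.0f} GHz-cores, ${cost} "
          f"(${cost / ghz_cores:.0f} per GHz-core)")
# E5-2640: 12 cores, 30 GHz-cores, $1770 ($59 per GHz-core)
# E5-2650: 16 cores, 32 GHz-cores, $2214 ($69 per GHz-core)
```

By that crude measure, the E5-2650 buys roughly 7% more aggregate clock for about 25% more money, which is why the right choice ends up depending on the workload.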

On Thu, Oct 11, 2012 at 3:03 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
My own clusters are too temporary and virtual for me to notice. I haven't thought of clock speed as mattering in a long time, so I'm curious what kinds of use cases might benefit from faster cores. Is there a category of workload where this sweet spot for faster cores occurs?

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 11:39 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:

You should measure your workload.  Your experience will vary dramatically with different computations.

On Thu, Oct 11, 2012 at 10:56 AM, Russell Jurney <[EMAIL PROTECTED]> wrote:
Anyone got data on this? This is interesting, and somewhat counter-intuitive.

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 10:47 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:

> Presumably, if you have a reasonable number of cores, speeding the cores up will be better than forking a task into smaller and smaller chunks, because at some point the overhead of multiple processes becomes a bottleneck, maybe due to streaming reads and writes? I'm sure each and every problem has a different sweet spot.
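That hunch can be illustrated with a toy model (purely illustrative, not from the thread, and no substitute for measuring the actual workload): split a fixed amount of work across n parallel tasks, and charge a per-task overhead that grows with the number of concurrent processes. Depending on how heavy that overhead is, either the fewer-faster or the more-slower configuration wins:

```python
# Toy model (illustrative only): fixed total work split across n parallel
# tasks; per-task overhead (process startup, contention on streaming
# reads/writes) grows with the number of concurrent tasks.
def wall_time(work_ghz_s, n_tasks, clock_ghz, overhead_s_per_task):
    compute = work_ghz_s / (n_tasks * clock_ghz)   # ideal parallel speedup
    overhead = overhead_s_per_task * n_tasks       # cost of more processes
    return compute + overhead

for ovh in (1.0, 5.0):   # light vs. heavy per-task overhead
    few_fast = wall_time(3600, n_tasks=12, clock_ghz=2.5, overhead_s_per_task=ovh)
    many_slow = wall_time(3600, n_tasks=16, clock_ghz=2.0, overhead_s_per_task=ovh)
    print(f"overhead {ovh:.0f}s/task: 12x2.5GHz -> {few_fast:.0f}s, "
          f"16x2.0GHz -> {many_slow:.0f}s")
# overhead 1s/task: 12x2.5GHz -> 132s, 16x2.0GHz -> 128s  (more, slower cores win)
# overhead 5s/task: 12x2.5GHz -> 180s, 16x2.0GHz -> 192s  (fewer, faster cores win)
```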
Also in this thread:
Ted Dunning       2012-10-11, 19:56
Steve Loughran    2012-10-12, 08:19
Russell Jurney    2012-10-13, 07:22
Aaron Eng         2012-10-11, 19:15