Re: HyperThreading in TaskTracker nodes?
Power issues aside, I've seen similar sorts of performance gains for MR
workloads - around 15-20%.

I think a fair bit of it is due to poor CPU cache utilization in various
parts of Hadoop - hyperthreading gets some extra parallelism there while
the core is waiting on round trips to DRAM.

-Todd

On Tue, Feb 5, 2013 at 10:03 AM, Brad Sarsfield <[EMAIL PROTECTED]> wrote:

> Hate to say it, but HyperThreading can have either positive or negative
> performance characteristics.  It all depends on your workload.  You have to
> measure very carefully; it may not even be a bottleneck(!) :)
>
> I hit a pretty significant power issue when I enabled HyperThreading at
> multi-thousand node scale.  We hit a ~8-10% power utilization increase,
> which, if rolled out to the entire cluster, would put me a few percent over
> our max spec power. In this case, for our workload, we actually saw a 15%
> improvement in processing throughput / job latency.  We ended up literally
> turning off machines and enabling HyperThreading on the remaining ones and saw
> an overall ~10% efficiency gain in the cluster, with a few fewer machines,
> but running hot on power.
>
> ~Brad
>
> -----Original Message-----
> From: Terry Healy [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, February 5, 2013 7:20 AM
> To: [EMAIL PROTECTED]
> Subject: HyperThreading in TaskTracker nodes?
>
> I would like to get some opinions / recommendations about the pros and
> cons of enabling HyperThreading on TaskTracker nodes. Presumably memory
> could be an issue, but is there anything to be gained, perhaps because of
> I/O wait? My small cluster is made up of relatively old systems, most of
> which are quite slow to/from disk, if that matters.
>
> Thanks,
>
> Terry
>
>
>
>
--
Todd Lipcon
Software Engineer, Cloudera
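
One way to sanity-check the numbers Brad quotes is a back-of-the-envelope model. The sketch below assumes that power and throughput both scale linearly with the number of machines left on, takes 9% as the midpoint of the quoted 8-10% power increase, and reads "efficiency gain" as total cluster throughput. None of those assumptions come from the thread, and the function names are made up for illustration.

```python
# Rough model of the "turn off some machines, enable HyperThreading on the
# rest" tradeoff. All values are relative to a baseline of every machine
# powered on with HyperThreading disabled (baseline power = throughput = 1.0).
# ASSUMPTIONS (not from the thread): linear scaling with machine count,
# a 9% per-machine power increase, a 15% per-machine throughput gain.


def cluster_tradeoff(kept_fraction, power_increase=0.09, throughput_gain=0.15):
    """Return (relative_power, relative_throughput) when only
    `kept_fraction` of the machines stay on, all with HyperThreading."""
    relative_power = kept_fraction * (1.0 + power_increase)
    relative_throughput = kept_fraction * (1.0 + throughput_gain)
    return relative_power, relative_throughput


def fraction_for_target_throughput(target, throughput_gain=0.15):
    """Fraction of machines to keep so that cluster throughput equals
    `target` times the baseline."""
    return target / (1.0 + throughput_gain)


if __name__ == "__main__":
    # Which fraction of machines gives a ~10% overall throughput gain?
    kept = fraction_for_target_throughput(1.10)
    power, throughput = cluster_tradeoff(kept)
    print(f"machines kept:       {kept:.1%}")
    print(f"relative power:      {power:.3f}")
    print(f"relative throughput: {throughput:.3f}")
```

Under those assumptions, a ~10% throughput gain corresponds to keeping roughly 96% of the machines while drawing about 4% more power than the baseline, which is at least consistent with "a few fewer machines, but running hot on power". Real clusters will not scale this cleanly, so treat it as a starting point for measurement, not a prediction.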