Ted Dunning 2013-01-18, 22:10
Jeffrey Buell 2013-01-19, 01:01
Jeff makes some good points here.
On Fri, Jan 18, 2013 at 5:01 PM, Jeffrey Buell <[EMAIL PROTECTED]> wrote:
> I disagree. There are some significant advantages to using "many small
> nodes" instead of "few big nodes". As Ted points out, there are some
> disadvantages as well, so you have to look at the trade-offs. But consider:
> - NUMA: If your hadoop nodes span physical NUMA nodes, then performance
> will suffer from remote memory accesses. The Linux scheduler tries to
> minimize this, but I've found that about 1/3 of memory accesses are remote
> on a 2-socket machine. This effect will be more severe on bigger
> machines. Hadoop nodes that fit on a NUMA node will have not access remote
> memory at all (at least on vSphere).
This is definitely a good point with respect to untainted Hadoop, but with
a system like MapR, there is a significant amount of core locality that
goes on to minimize NUMA-remote fetches. This can have significant impact,
- Disk partitioning: Smaller nodes with fewer disks each can significantly
> increase average disk utilization, not decrease it. Having many threads
> operating against many disks in the "big node" case tends to leave some
> disks idle while others are over-subscribed.
Again, this is an implementation side-effect. Good I/O scheduling and
proper striping can mitigate this substantially.
Going the other way, splitting disks between different VM's can be
> Partitioning disks among nodes decreases this effect. The extreme case
> is one disk per node, where no disks will be idle as long as there is work
> to do.
Yes. Even deficient implementations should succeed in this case.
You do lose the ability to allow big-memory jobs that would otherwise span
> - Management: Not a performance effect, but smaller nodes enable easier
> multi-tenancy, multiple virtual Hadoop clusters, sharing physical hardware
> with other workloads, etc.