We're running a small 30-node cluster and will reinstall all the software
in a few days, so I want to change the HDD configuration, which was set up
a long time ago and seems inefficient: each node has 2x1TB drives that are
combined via LVM into one single volume.
How do people usually set up drives? For example, would it be better to
mount them as two separate folders and feed these folders to both the
tasktracker and the datanode? Or to set up LVM with RAID 0 to maximize
bandwidth?
What I want is for the 2TB of drive space per node to be equally accessible
to both the tasktracker and the datanode, and I'm not sure that mounting the
two drives as separate folders achieves that. (For example, if a reducer
fills one drive, will it start writing the rest of the data to the second
drive?)
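
To make the "two separate folders" option concrete, here is a rough sketch
of the setup I have in mind. It assumes the classic property names
dfs.data.dir (datanode) and mapred.local.dir (tasktracker), each taking a
comma-separated list of directories. The mount points /mnt/disk1 and
/mnt/disk2 and the subdirectory names are made up for illustration, not our
real layout. The script only prints the property entries that would go into
the site config files.

```python
# Sketch only: mount points and subdirectory names are hypothetical.
from xml.sax.saxutils import escape

MOUNTS = ["/mnt/disk1", "/mnt/disk2"]  # assumed: one mount point per physical drive


def dir_list(mounts, subdir):
    """Join one subdirectory per mount into a comma-separated directory list."""
    return ",".join(f"{m}/{subdir}" for m in mounts)


def render_property(name, value):
    """Render a single <property> element for a *-site.xml file."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


# dfs.data.dir is read by the datanode, mapred.local.dir by the tasktracker.
# Both point at the same two drives, in separate subdirectories.
properties = {
    "dfs.data.dir": dir_list(MOUNTS, "dfs/data"),
    "mapred.local.dir": dir_list(MOUNTS, "mapred/local"),
}

if __name__ == "__main__":
    for name, value in properties.items():
        print(render_property(name, value))
```

With this layout both daemons would see both drives, which is the sharing I
am after; what I don't know is how each of them behaves once one of the
listed directories runs out of space.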