I have a quick question regarding RAID0 performance vs multiple
dfs.data.dir directories.
Let's say I have 2 x 2TB drives.
I can configure them as 2 separate drives mounted on 2 folders and
assign both folders to Hadoop using dfs.data.dir. Or I can combine the
2 drives with RAID0 and assign them as a single folder to dfs.data.dir.
With RAID0, the reads and writes are going to be spread over the 2
disks, which significantly increases the speed. But if I put 2
entries in dfs.data.dir, Hadoop is going to spread its data over those 2
directories too, so in the end the results should be the same, no?
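To make the comparison concrete, here is a rough Python sketch of the two layouts. It is only a toy model, not a benchmark. It assumes that Hadoop hands whole blocks to the dfs.data.dir directories in round-robin order and that RAID0 alternates fixed-size chunks between the disks. The 64 MB block size, the 64 KB chunk size and all function names are illustrative assumptions.

```python
from collections import Counter

BLOCK_SIZE = 64 * 1024 * 1024   # assumed HDFS block size (64 MB)
CHUNK_SIZE = 64 * 1024          # assumed RAID0 chunk size (64 KB)
NUM_DISKS = 2


def separate_dirs_bytes_per_disk(num_blocks, num_disks=NUM_DISKS):
    """Two dfs.data.dir entries: each whole block lands on one disk,
    with disks chosen in round-robin order (assumption)."""
    usage = Counter()
    for block in range(num_blocks):
        usage[block % num_disks] += BLOCK_SIZE
    return [usage[d] for d in range(num_disks)]


def raid0_bytes_per_disk(num_blocks, num_disks=NUM_DISKS):
    """RAID0: every block is cut into chunks that alternate between disks."""
    usage = Counter()
    chunks_per_block = BLOCK_SIZE // CHUNK_SIZE
    for chunk in range(num_blocks * chunks_per_block):
        usage[chunk % num_disks] += CHUNK_SIZE
    return [usage[d] for d in range(num_disks)]


def disks_serving_one_block(layout, num_disks=NUM_DISKS):
    """How many disks take part when a single block is read or written."""
    if layout == "separate":
        return 1
    return min(num_disks, BLOCK_SIZE // CHUNK_SIZE)


if __name__ == "__main__":
    print("separate dirs:", separate_dirs_bytes_per_disk(100))
    print("raid0:        ", raid0_bytes_per_disk(100))
    print("disks per block (separate):", disks_serving_one_block("separate"))
    print("disks per block (raid0):   ", disks_serving_one_block("raid0"))
```

In this model both layouts put the same number of bytes on each disk overall. The difference is that a single block is served by one disk with separate directories and by both disks with RAID0, so whether the results are really the same depends on how many blocks are being read or written at once.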
Any experience/advice/results to share?
Michael Katzenellenbogen 2013-02-11, 02:12
Jean-Marc Spaggiari 2013-02-11, 02:19
Jean-Marc Spaggiari 2013-02-11, 15:54
Michael Katzenellenbogen 2013-02-11, 16:02
Marcos Ortiz 2013-02-11, 02:39