HDFS >> mail # user >> Multiple dfs.data.dir vs RAID0


Jean-Marc Spaggiari 2013-02-11, 01:57
Michael Katzenellenbogen 2013-02-11, 02:12
Re: Multiple dfs.data.dir vs RAID0
The issue is that my motherboard does not support JBOD :( Only RAID is
possible, and I have been fighting with it for the last 48 hours and
still cannot make it work... That's why I'm thinking about using
dfs.data.dir instead.

I have 1 drive per node so far and need to move to 2 to reduce I/O
wait (WIO).

What would be the advantage of JBOD over dfs.data.dir? I have run some
tests of JBOD vs LVM and have not found any advantage for JBOD so far.

JM

2013/2/10, Michael Katzenellenbogen <[EMAIL PROTECTED]>:
> One thought comes to mind: disk failure. In the event a disk goes bad,
> then with RAID0, you just lost your entire array. With JBOD, you lost
> one disk.
>
> -Michael
>
> On Feb 10, 2013, at 8:58 PM, Jean-Marc Spaggiari
> <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I have a quick question regarding RAID0 performance vs multiple
>> dfs.data.dir entries.
>>
>> Let's say I have 2 x 2TB drives.
>>
>> I can configure them as 2 separate drives mounted on 2 folders and
>> assigned to hadoop using dfs.data.dir. Or I can mount the 2 drives
>> with RAID0 and assign them as a single folder to dfs.data.dir.
>>
>> With RAID0, reads and writes are spread over the 2 disks, which
>> significantly increases the speed. But if I put 2 entries in
>> dfs.data.dir, hadoop is going to spread over those 2 directories
>> too, and in the end the results should be the same, no?
>>
>> Any experience/advice/results to share?
>>
>> Thanks,
>>
>> JM
>
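
The trade-off discussed above (two dfs.data.dir entries vs one RAID0 volume, and what a single disk failure costs in each case) can be illustrated with a small sketch. This is only a toy model, not Hadoop code: it assumes the DataNode places new blocks on the configured directories in roughly round-robin order, and the mount paths, block count, and function names are made up for illustration.

```python
from itertools import cycle

# Hypothetical dfs.data.dir value: a comma-separated list of directories,
# one per physical disk. The paths are invented for this example.
DFS_DATA_DIR = "/mnt/disk1/dfs/data,/mnt/disk2/dfs/data"


def place_blocks(num_blocks, data_dirs):
    """Assign block ids to directories in round-robin order (assumed policy)."""
    placement = {d: [] for d in data_dirs}
    for block_id, d in zip(range(num_blocks), cycle(data_dirs)):
        placement[d].append(block_id)
    return placement


def blocks_lost_separate_dirs(placement, failed_dir):
    """With separate directories, only blocks on the failed disk are lost."""
    return len(placement[failed_dir])


def blocks_lost_raid0(num_blocks):
    """With RAID0, every block is striped over all disks, so one failed
    disk makes every block on the array unreadable."""
    return num_blocks


data_dirs = DFS_DATA_DIR.split(",")
placement = place_blocks(10, data_dirs)

print(blocks_lost_separate_dirs(placement, data_dirs[0]))  # 5
print(blocks_lost_raid0(10))                               # 10
```

In this model both layouts spread I/O over the two disks, but losing one disk costs half of the node's blocks with separate directories and all of them with RAID0, which is the point Michael raises.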
Jean-Marc Spaggiari 2013-02-11, 15:54
Michael Katzenellenbogen 2013-02-11, 16:02
Marcos Ortiz 2013-02-11, 02:39