Jean-Marc Spaggiari
2013-02-11, 01:57
Michael Katzenellenbogen
2013-02-11, 02:12
Jean-Marc Spaggiari
2013-02-11, 02:19
Marcos Ortiz
2013-02-11, 02:39
Jean-Marc Spaggiari
2013-02-11, 15:54
Michael Katzenellenbogen
2013-02-11, 16:02
-
Multiple dfs.data.dir vs RAID0
Jean-Marc Spaggiari 2013-02-11, 01:57
Hi,
I have a quick question regarding RAID0 performance vs multiple dfs.data.dir entries.

Let's say I have 2 x 2TB drives.

I can configure them as 2 separate drives mounted on 2 folders and assign them to Hadoop using dfs.data.dir. Or I can combine the 2 drives with RAID0 and assign them as a single folder to dfs.data.dir.

With RAID0, the reads and writes are going to be spread over the 2 disks. This significantly increases the speed. But if I put 2 entries in dfs.data.dir, Hadoop is going to spread over those 2 directories too, and in the end, the results should be the same, no?

Any experience/advice/results to share?

Thanks,

JM
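For readers comparing the two layouts in the question, here is a minimal sketch of what the dfs.data.dir property could look like in each case. It assumes the property takes a comma-separated list of directories; the mount points /hadoop1 and /hadoop2 and the helper name `data_dir_property` are placeholders chosen for illustration, not anything prescribed by Hadoop.

```python
import xml.etree.ElementTree as ET


def data_dir_property(directories: list[str]) -> str:
    """Render a dfs.data.dir <property> element for hdfs-site.xml.

    Hypothetical helper for illustration only.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "dfs.data.dir"
    # The value is a comma-separated list of local directories.
    ET.SubElement(prop, "value").text = ",".join(directories)
    return ET.tostring(prop, encoding="unicode")


# One directory per physical disk (placeholder mount points).
print(data_dir_property(["/hadoop1", "/hadoop2"]))
# A single directory sitting on top of a 2-disk RAID0 volume.
print(data_dir_property(["/hadoop1"]))
```

In the first form Hadoop sees two independent volumes; in the second it sees one volume and leaves the striping to the RAID layer.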
-
Re: Multiple dfs.data.dir vs RAID0
Michael Katzenellenbogen 2013-02-11, 02:12
One thought comes to mind: disk failure. If a disk goes bad with RAID0, you lose your entire array. With JBOD, you lose one disk.

-Michael
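The failure argument above can be put into numbers with a small sketch. It assumes disks fail independently, and the 5% per-disk failure probability is a made-up figure used only to show the arithmetic; the function names are hypothetical.

```python
def capacity_lost_on_single_disk_failure(layout: str, disks: int, disk_tb: float) -> float:
    """Return the local capacity (in TB) that becomes unreadable when one disk fails."""
    if layout == "raid0":
        # Striping spreads data across every member, so one dead disk
        # takes the whole array with it.
        return disks * disk_tb
    if layout == "jbod":
        # Independent file systems: only the failed disk's data is gone.
        return disk_tb
    raise ValueError(f"unknown layout: {layout}")


def raid0_loss_probability(disks: int, p_disk: float) -> float:
    """Probability that at least one of `disks` independent disks fails."""
    return 1.0 - (1.0 - p_disk) ** disks


print(capacity_lost_on_single_disk_failure("raid0", 2, 2.0))  # 4.0
print(capacity_lost_on_single_disk_failure("jbod", 2, 2.0))   # 2.0
# Assumed 5% per-disk failure probability, for illustration only.
print(raid0_loss_probability(2, 0.05))
```

With 2 x 2TB drives, a single failure costs 4TB under RAID0 but 2TB under JBOD, and the chance of losing the RAID0 array grows with every disk added to it.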
-
Re: Multiple dfs.data.dir vs RAID0
Jean-Marc Spaggiari 2013-02-11, 02:19
The issue is that my motherboard does not do JBOD :( Only RAID is possible, and I have been fighting with it for the last 48 hours and am still not able to make it work... That's why I'm thinking about using dfs.data.dir instead.

I have 1 drive per node so far and need to move to 2 to reduce WIO.

What would be better with JBOD compared to dfs.data.dir? I have done some tests of JBOD vs LVM and did not find any pros for JBOD so far.

JM
-
Re: Multiple dfs.data.dir vs RAID0
Marcos Ortiz 2013-02-11, 02:39
We have seen in several of our Hadoop clusters that LVM degrades the performance of our M/R jobs. I remember a message where Ted Dunning explained something about this, and since then we don't use LVM for Hadoop data directories.

About RAID volumes, the best performance that we have achieved is using RAID 10 for our Hadoop data directories.

On 02/10/2013 09:24 PM, Michael Katzenellenbogen wrote:
> Are you able to create multiple RAID0 volumes? Perhaps you can expose
> each disk as its own RAID0 volume...
>
> Not sure why or where LVM comes into the picture here ... LVM is on
> the software layer and (hopefully) the RAID/JBOD stuff is at the
> hardware layer (and in the case of HDFS, LVM will only add unneeded
> overhead).
>
> -Michael

--
Marcos Ortiz Valmaseda,
Product Manager && Data Scientist at UCI
Blog: http://marcosluis2186.posterous.com
Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>
-
Re: Multiple dfs.data.dir vs RAID0
Jean-Marc Spaggiari 2013-02-11, 15:54
Thanks all for your feedback.

I have updated the HDFS config to add another dfs.data.dir entry and restarted the node. Hadoop is starting to use the new entry, but it is not spreading the existing data over the 2 directories.

Let's say you have a 2TB disk on /hadoop1, almost full. If you add another 2TB disk on /hadoop2 and add it to dfs.data.dir, Hadoop will start to write into /hadoop1 and /hadoop2, but /hadoop1 will stay almost full. It will not balance the already existing data over the 2 directories.

I have deleted all the content of /hadoop1 and /hadoop2 and restarted the node, and now the data is spread over the 2. I just need to wait for the replication to complete.

So what I will do instead is add 2 x 2TB drives, mount them as RAID0, then move the existing data onto this volume and remove the previous one. That way Hadoop will still see one directory under /hadoop1, but it will be 4TB instead of 2TB...

Is there anywhere I can read about Hadoop vs the different kinds of physical data storage configuration? (Book, web, etc.)

JM

2013/2/11, Ted Dunning <[EMAIL PROTECTED]>:
> Typical best practice is to have a separate file system per spindle. If
> you have a RAID-only controller (many are), then you just create one RAID
> volume per spindle. The effect is the same.
>
> MapR is unusual in being able to stripe writes over multiple drives
> organized into a storage pool, but you will not normally be able to
> achieve that same level of performance with ordinary Hadoop by using LVM
> over JBOD or controller level RAID. The problem is that the Java layer
> doesn't understand that the storage is striped and the controller doesn't
> understand what Hadoop is doing. MapR schedules all of the writes to
> individual spindles via a very fast state machine embedded in the file
> system.
>
> The comment about striping increasing the impact of a single disk drive
> failure is exactly correct, and it makes modeling the failure modes of the
> system considerably more complex. The net result of the modeling that I
> and others have done is that moderate to large RAID groups in storage
> pools for moderate sized clusters (< 2000 nodes or so) are just fine. For
> large clusters of up to 10,000 nodes, you should probably limit RAID
> groups to 4 drives or fewer.
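Jean-Marc's observation, that a newly added dfs.data.dir entry receives new writes while the existing data stays where it is, can be illustrated with a toy model. This is a simplified sketch that assumes the DataNode picks directories in round-robin order and ignores free-space checks; the 1 GB block size, the usage figures and the function name `place_new_blocks` are invented for the example.

```python
from itertools import cycle


def place_new_blocks(used_gb: dict[str, int], new_blocks: int, block_gb: int = 1) -> dict[str, int]:
    """Simulate round-robin placement of new blocks over data directories.

    `used_gb` maps each directory to the space already in use. Existing
    data is never moved; only new blocks are distributed.
    """
    result = dict(used_gb)  # copy, so the caller's mapping is untouched
    directories = cycle(list(result))  # visit directories in order, forever
    for _ in range(new_blocks):
        result[next(directories)] += block_gb
    return result


# /hadoop1 is almost full, /hadoop2 has just been added (made-up figures).
before = {"/hadoop1": 1900, "/hadoop2": 0}
after = place_new_blocks(before, new_blocks=100)
print(after)  # {'/hadoop1': 1950, '/hadoop2': 50}
```

In this model the gap between the two directories never closes, because both grow at the same rate, which is consistent with /hadoop1 staying almost full after /hadoop2 was added.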
-
Re: Multiple dfs.data.dir vs RAID0
Michael Katzenellenbogen 2013-02-11, 16:02
On Mon, Feb 11, 2013 at 10:54 AM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
> Is there anywhere I can read about Hadoop vs the different kinds
> of physical data storage configuration? (Book, web, etc.)

"Hadoop Operations" by E. Sammer: http://shop.oreilly.com/product/0636920025085.do