Re: one or more file system
Arun C Murthy 2012-10-16, 23:55
Can you guys pls move this discussion to user@? Thanks.
On Oct 16, 2012, at 4:45 PM, Andy Isaacson wrote:

> RAID5 is suboptimal for HDFS due to the spindle imbalance issue (among
> other problems). Read this paper for details:
>
> "Disks are like Snowflakes: No Two Are Alike"
> www.usenix.org/event/hotos11/tech/final_files/Krevat.pdf
>
> For best performance configure your storage as JBOD instead of RAID,
> format each spindle as a separate ext4 filesystem, and put a datadir
> on each spindle.
>
> Your disk array will have a configuration utility to set JBOD instead
> of RAID. Please consult the documentation for your disk array for the
> details.
>
> If you must use RAID5 then one filesystem and one datadir is your best option.
>
> For *BAD* performance, put multiple logical volumes on a single RAID
> and put multiple datadirs on the RAID. This will result in low IOPS,
> low throughput, and high contention.
>
> -andy
>
> On Tue, Oct 9, 2012 at 2:13 AM, Xiang Hua <[EMAIL PROTECTED]> wrote:
>> Hi,
>> but how to "configure disk array as JBOD", we plan to use disk array
>> with RAID5 and make LUN of 1T.
>> so we have many LUN of the size of 1T. and we mkfs on every LUN, so we
>> have 12 fs /data1...../data12, which will be put into HDFS.
>>
>>
>> Best R.
>>
>> beatls
>>
>> On Tue, Oct 9, 2012 at 1:45 AM, Andy Isaacson <[EMAIL PROTECTED]> wrote:
>>
>>> On Mon, Oct 8, 2012 at 8:30 AM, Xiang Hua <[EMAIL PROTECTED]> wrote:
>>>> Hi,
>>>> we have 4T disk from a diskarray.
>>>> i want to split 2T*1 to 1T*2, then add to HDFS, which leads to more
>>>> local storage directories.
>>>> this time we have 12 local directories(1T), is it harmful to hdfs
>>>> performance?
>>>
>>> Assuming you're running a modern Hadoop on a recent Linux (2.6.38 or
>>> later, or RHEL6):
>>>
>>> For best performance you should configure your disk array as JBOD
>>> rather than RAID, then put one ext4 filesystem on each spindle. Do not
>>> put multiple storage directories on a single spindle, that results in
>>> very bad performance and no benefit over a single storage directory
>>> per spindle. And do not put multiple spindles under a single storage
>>> directory, that results in poor utilization and bad performance with
>>> no significant benefit.
>>>
>>> 12 local storage directories will perform just fine assuming you have
>>> enough CPU power to use them.
>>>
>>> -andy
>>>

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
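
The rule quoted in this thread (one storage directory per spindle, never several storage directories on the same device) can be sanity-checked with a few lines of Python. This is only a minimal sketch and not something posted in the thread: the function names `group_dirs_by_device` and `shared_devices` are made up for illustration, and the `stat_fn` parameter exists only so the check can be exercised without real mounts. One important assumption: `st_dev` identifies the mounted filesystem, not the physical spindle, so several LUNs carved out of one RAID5 group would still look like separate devices here. The script can only catch the simpler mistake of two datadirs sharing one filesystem.

```python
import os
from collections import defaultdict


def group_dirs_by_device(data_dirs, stat_fn=os.stat):
    """Map each filesystem device ID (st_dev) to the data directories on it."""
    by_device = defaultdict(list)
    for path in data_dirs:
        # st_dev is the ID of the device containing the path's filesystem.
        by_device[stat_fn(path).st_dev].append(path)
    return dict(by_device)


def shared_devices(data_dirs, stat_fn=os.stat):
    """Return only the devices that hold more than one data directory."""
    groups = group_dirs_by_device(data_dirs, stat_fn)
    return {dev: dirs for dev, dirs in groups.items() if len(dirs) > 1}
```

On a datanode one might call `shared_devices(["/data1", "/data2"])` with the actual list of configured storage directories; an empty result means no two of them share a filesystem. Whether each filesystem really sits on its own spindle still has to be confirmed with the disk array's own tools.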