|
Tom Deutsch
2012-10-17, 13:31
Pamecha, Abhishek
2012-10-18, 00:21
Luca Pireddu
2012-10-18, 12:32
Tom Deutsch
2012-10-18, 14:37
Pamecha, Abhishek
2012-10-18, 15:18
Jitendra Kumar Singh
2012-10-18, 13:48
Michael Segel
2012-10-18, 13:58
Pamecha, Abhishek
2012-10-18, 15:08
seth
2012-10-18, 15:15
Zhani Pellumbi
2012-10-18, 15:46
Steve Loughran
2012-10-19, 08:06
Pamecha, Abhishek
2012-10-19, 00:29
Pamecha, Abhishek
2012-10-16, 18:28
Jeffrey Buell
2012-10-16, 21:24
lohit
2012-10-16, 22:26
Pamecha, Abhishek
2012-10-16, 23:28
Kevin O'dell
2012-10-17, 13:25
Mohamed Riadh Trad
2012-10-17, 13:37
Pamecha, Abhishek
2012-10-18, 00:26
|
-
Re: HDFS using SAN
Tom Deutsch 2012-10-17, 13:31
And of course IBM has supported our GPFS and SONAS customers for a couple of years already.
-
RE: HDFS using SAN
Pamecha, Abhishek 2012-10-18, 00:21
Tom

Do you mean you are using GPFS instead of HDFS? Also, if you can share, are you deploying it as a DAS setup or a SAN?

Thanks,
Abhishek

From: Tom Deutsch [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 17, 2012 6:31 AM
To: user
Subject: Re: HDFS using SAN

And of course IBM has supported our GPFS and SONAS customers for a couple of years already.

From: "Kevin O'dell" [[EMAIL PROTECTED]]
Sent: 10/17/2012 09:25 AM AST
To: [EMAIL PROTECTED]
Subject: Re: HDFS using SAN

You may want to take a look at the NetApp white paper on this. They have a SAN solution as their Hadoop offering.

http://www.netapp.com/templates/mediaView?m=tr-3969.pdf&cc=us&wid=130618138&mid=56872393

On Tue, Oct 16, 2012 at 7:28 PM, Pamecha, Abhishek <[EMAIL PROTECTED]> wrote:

Yes, for MR, my impression is that typically the n/w utilization is next to none during map and reduce tasks but jumps during shuffle. With a SAN, I would assume there is no such separation. There will be network activity all over the job’s time window, with shuffle probably doing more than what it should.

Moreover, I hear that typically SANs, by default, would split data across different physical disks [even w/o RAID], so contiguity is lost. But I have no idea if that is a good thing or bad. Looks bad on the surface, but it probably depends on how efficiently a SAN can parallelize data fetches from multiple physical disks. Any comments on this aspect?

And yes, when the dataset volume increases and one needs to basically do full table scan equivalents, I am assuming the n/w needs to support that entire data move from the SAN to the data nodes, all in parallel to different mappers.

So what I am gathering is that although storing data over a SAN is possible for a Hadoop installation, map-shuffle-reduce may not be the best way to process data in that env. Is this conclusion correct?

<3 way Replication and RAID suggestions are great.

Thanks,
Abhishek

From: lohit [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 16, 2012 3:26 PM
To: [EMAIL PROTECTED]
Subject: Re: HDFS using SAN

Adding to this. Locality is very important for MapReduce applications. One might not see much of a difference for small MapReduce jobs running on direct attached storage vs SAN, but when your cluster grows or you find jobs which are heavy on IO, you would see quite a bit of difference. One thing which is obvious is also the cost difference. The argument for that has been that SAN storage is much more reliable, so you do not need the default 3-way replication factor you would use on direct attached storage.

2012/10/16 Jeffrey Buell <[EMAIL PROTECTED]>

It will be difficult to make a SAN work well for Hadoop, but not impossible. I have done direct comparisons (but not published them yet). Direct local storage is likely to have much more capacity and more total bandwidth. But you can do pretty well with a SAN if you stuff it with the highest-capacity disks and provide an independent 8 Gb (FC) or 10 GbE connection for every host. Watch out for overall SAN bandwidth limits (which may well be much less than the sum of the capacity of the wires connected to it). There will definitely be a hard limit to how many hosts you connect to a single SAN. Scaling to larger clusters will require multiple SANs.

Locality is an issue. Even though each host has direct physical access to all the data, a “remote” access in HDFS will still have to go over the network to the host that owns the data. “Local” access is fine with the constraints above.

RAID is not good for Hadoop performance for both local and SAN storage, so you’ll want to configure one LUN for each physical disk in the SAN. If you do have mirroring or RAID on the SAN, you may be tempted to use that to replace Hadoop replication. But while the data is protected, access to the data is lost if the datanode goes down. You can get around that by running the datanode in a VM which is stored on the SAN and using VMware HA to automatically restart the VM on another host in case of a failure. Hortonworks has demonstrated this use-case but this strategy is a bit bleeding-edge.

Jeff

From: Pamecha, Abhishek [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 16, 2012 11:28 AM
To: [EMAIL PROTECTED]
Subject: HDFS using SAN

Hi

I have read scattered documentation across the net which mostly says HDFS doesn't go well with a SAN being used to store data, while some say it is an emerging trend. I would love to know if there have been any tests performed which hint at the aspects in which direct storage excels or falls behind a SAN.

We are investigating whether a direct storage option is better than SAN storage for a modest cluster with data in the 100 TBs in steady state. The SAN of course can support an order of magnitude more of the IOPS we care about for now, but given it is a shared infrastructure and we may expand our data size, it may not be an advantage in the future.

Another thing I am interested in: for MR jobs, where data locality is the key driver, how does that pan out when using a SAN instead of direct storage?

And of course on the subjective topics of availability and reliability when using a SAN for data storage in HDFS, I would love to receive your views.

Thanks,
Abhishek

Have a Nice Day!
Lohit

Kevin O'Dell
Customer Operations Engineer, Cloudera
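As a rough illustration of Jeff's "one LUN for each physical disk" suggestion quoted above: each LUN would be mounted on the datanode as its own filesystem and listed as a separate entry in `dfs.data.dir` (the Hadoop 1.x property name, which takes a comma-separated list of directories). The mount points and helper functions below are hypothetical and only show the shape of the setting, not a recommended layout.

```python
# Sketch only: builds the value of dfs.data.dir for a datanode whose SAN LUNs
# are each mounted separately. Mount points are hypothetical examples.
from xml.sax.saxutils import escape


def data_dir_value(mount_points, subdir="hdfs/data"):
    """Join one data directory per LUN into a comma-separated list."""
    return ",".join(f"{m.rstrip('/')}/{subdir}" for m in mount_points)


def as_property_xml(name, value):
    """Render a single <property> element as used in hdfs-site.xml."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    luns = [f"/mnt/lun{i}" for i in range(4)]  # assumed: one mount per LUN
    print(as_property_xml("dfs.data.dir", data_dir_value(luns)))
```

The datanode then spreads blocks across the listed directories itself, which is why striping them together underneath is not needed.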
-
Re: HDFS using SAN
Luca Pireddu 2012-10-18, 12:32
On 10/18/2012 02:21 AM, Pamecha, Abhishek wrote:
> Tom
>
> Do you mean you are using GPFS instead of HDFS? Also, if you can share,
> are you deploying it as DAS set up or a SAN?
>
> Thanks,
>
> Abhishek

Though I don't think I'd buy a SAN for a new Hadoop cluster, we have a SAN and are using it *instead of HDFS* with a small/medium Hadoop MapReduce cluster (up to 100 nodes or so, depending on our need). We still use the local node disks for intermediate data (mapred local storage). Although this set-up does limit our possibility to scale to a large number of nodes, that's not a concern for us. On the plus, we gain the flexibility to be able to share our cluster with non-Hadoop users at our centre.

--
Luca Pireddu
CRS4 - Distributed Computing Group
Loc. Pixina Manna Edificio 1
09010 Pula (CA), Italy
Tel: +39 0709250452
-
Re: HDFS using SAN
Tom Deutsch 2012-10-18, 14:37
Agreed Luca, we do this to support existing customers that have requested it and it works fine within obvious IO considerations. But not a recommended way to do a green field deployment.

------------------------------------------------
Tom Deutsch
Program Director Information Management Big Data Technologies
IBM
3565 Harbor Blvd
Costa Mesa, CA 92626-1420
[EMAIL PROTECTED]
Twitter: @thomasdeutsch
Data Management Blog: ibmdatamag.com/author/tdeutsch/
LinkedIn: http://www.linkedin.com/profile/view?id=833160
Quora: http://www.quora.com/Tom-Deutsch
Smarter Computing Blog: http://www.smartercomputingblog.com/contributorsprofile/?user_id=223
IBM Big Data Hub Blog: http://www.ibmbigdatahub.com/blog/author/tom-deutsch
Big Data for Business Executives Group: http://www.linkedin.com/groups?gid=4455695
-
Re: HDFS using SAN
Pamecha, Abhishek 2012-10-18, 15:18
Yes, I have been reaching the same conclusions here. Tom, would you care to spell out the 'obvious' IO considerations? I would like to see if there are more that are different from mine.

My 3 observations have been that:

1. For full table scan MR jobs, the SAN approach is transporting the entire dataset over the n/w to the data nodes. Not good.
2. The shuffle actually includes more n/w transfers when it could have been just an intra-SAN transfer. Disadvantage.
3. SAN controller caches (an additional stop in data transfer as opposed to DAS) may not be utilized as effectively because they are shared by multiple data nodes (frequent eviction).

So overall my conclusion is that MR is not the best suited data processing method when data is stored in a SAN.

Btw, I thought a SAN would do block level transfer and the file system on top is your choice. I was surprised to see GPFS 'as' the SAN. Could you please clarify?

Any way you can share your cluster size?

Thanks
Abhishek
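To put rough numbers on observation 1, here is a back-of-the-envelope sketch. It leans on Jeff's earlier point that the SAN's overall bandwidth may be much less than the sum of the host links. Apart from the 100 TB dataset size mentioned in the original question, every figure (host count, link speed, SAN limit) is a made-up assumption for illustration.

```python
# Back-of-the-envelope sketch; every number except the 100 TB dataset size
# is an assumption chosen for illustration, not a measurement.

def effective_bandwidth_gbps(hosts, link_gbps, san_limit_gbps):
    """Aggregate read bandwidth is capped by the slower of the two:
    the sum of the host links or the SAN's own overall limit."""
    return min(hosts * link_gbps, san_limit_gbps)


def full_scan_hours(dataset_tb, bandwidth_gbps):
    """Time to move the whole dataset once at the given bandwidth."""
    bits = dataset_tb * 1e12 * 8          # decimal terabytes to bits
    return bits / (bandwidth_gbps * 1e9) / 3600


if __name__ == "__main__":
    dataset_tb = 100                      # from the original question
    hosts, link_gbps = 20, 10             # assumed: 20 hosts, 10 GbE each
    san_limit_gbps = 40                   # assumed overall SAN limit
    bw = effective_bandwidth_gbps(hosts, link_gbps, san_limit_gbps)
    print(f"effective bandwidth: {bw} Gb/s")
    print(f"full scan: {full_scan_hours(dataset_tb, bw):.1f} hours")
```

With direct attached storage the same scan is limited by the sum of the local disks instead, which is the comparison the thread keeps coming back to.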
-
Re: HDFS using SAN
Jitendra Kumar Singh 2012-10-18, 13:48
Hi,

The NetApp whitepaper on the SAN solution (link given by Kevin) makes the following statement. Can someone please elaborate (or give a link that explains) how 12 disks in a SAN can give 2000 IOPS while, if used as JBOD, they would give 600 IOPS?

"The E2660 can deliver up to 2,000 IOPS from a 12-disk stripe (the bottleneck being the 12 disks). This headroom translates into better read times for those 64KB blocks. Twelve copies of 12 MapReduce jobs reading from 12 SATA disks can at best never exceed 12 x 50 IOPS, or 600 IOPS. The E2660 volume has five times the IOPS headroom, which translates into faster read times and high MapReduce throughput"

Thanks and Regards,
--
Jitendra Kumar Singh
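For what it's worth, the arithmetic in the quoted passage can be checked directly. The sketch below uses only the numbers from the quote (12 disks, 50 IOPS per SATA disk, 2,000 IOPS for the stripe, 64KB blocks) and does not explain where the 2,000 figure comes from. Note that 2,000 / 600 is about 3.3, so the "five times" in the quote does not follow from these two numbers alone; the paper may be using a different baseline.

```python
# Arithmetic check using only the numbers from the quoted passage.

def jbod_iops(disks, iops_per_disk):
    """Upper bound when each disk is read independently."""
    return disks * iops_per_disk


def throughput_mb_per_s(iops, block_kb):
    """Throughput if every I/O transfers one block of block_kb kilobytes."""
    return iops * block_kb / 1024


if __name__ == "__main__":
    jbod = jbod_iops(12, 50)              # 12 x 50 from the quote
    stripe = 2000                         # quoted figure for the 12-disk stripe
    print(f"JBOD bound: {jbod} IOPS, "
          f"{throughput_mb_per_s(jbod, 64):.1f} MB/s at 64KB")
    print(f"Stripe:     {stripe} IOPS, "
          f"{throughput_mb_per_s(stripe, 64):.1f} MB/s at 64KB")
    print(f"ratio: {stripe / jbod:.2f}x")
```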
-
Re: HDFS using SAN
Michael Segel 2012-10-18, 13:58
I haven't played with a NetApp box, but the way it has been explained to me is that your SAN appears as if it's direct attached storage.
It's possible, based on drives and other hardware, plus it looks like they are focusing on read times only.

I'd contact a NetApp rep for a better answer.

Actually if you are looking at a higher density in terms of storage, going with a storage / compute cluster makes sense.
-
Re: HDFS using SAN
Pamecha, Abhishek 2012-10-18, 15:08
Yes, I had similar views from the NetApp paper. My use case is IO heavy and that's why (at least IMO), when the data set grows, a shared SAN begins to make less sense as opposed to DAS for MR type of jobs.

As Luca pointed out, sharing the same data with other apps is a great advantage with a SAN.

Thanks
Abhishek
-
Re: HDFS using SAN
seth 2012-10-18, 15:15
I wonder if large NAS equipment manufacturers have ever considered modifying their firmware to directly talk the DFS protocol that Hadoop uses. This way your compute nodes could be 'pure' compute nodes with only tasktracker processes.

Might be a way to extend their market a bit. Not sure it would actually perform well until it was tried.
-
Re: HDFS using SAN
Zhani Pellumbi 2012-10-18, 15:46
Yes, Isilon NAS runs HDFS natively, thus your nodes become "compute" nodes, running only task tracker processes.
I read the NetApp paper, and this is a fundamentally different architecture though.
There are some obvious benefits: being able to scale out your storage layer independently from your compute layer; also, since Isilon contains a large number of our datasets, it allows us to analyze that data in place without ingesting it into a different location.
Also, because of Isilon's OneFS filesystem, your name node is distributed across the entire Isilon cluster. However Isilon's documentation is lacking on this :(
We are currently in the early stages of testing this architecture, and cannot accurately speak on the performance of one vs the other yet.
I wonder if anyone else is using Isilon to run HDFS and can add some more details :)

Regards
Zhani Pellumbi
-
Re: HDFS using SAN
Steve Loughran 2012-10-19, 08:06
On 18 October 2012 16:46, Zhani Pellumbi <[EMAIL PROTECTED]> wrote:
> Yes, Isilon NAS runs HDFS natively- thus your nodes become "compute"
> nodes, running only task tracker processes.
> I read the NetApp paper, and this is fundamentally different architecture
> though.
> There are some obvious benefits, being able to scale out your storage
> layer independently from your compute layer, also since Isilon contains a
> large number of our datasets, it allows us to analyze that data in place
> without ingesting it into a diff location.
> Also because of Isilons OneFS filesystem, your name node is distributed
> across the entire Isilon cluster. However isilons documentation is lacking
> on this :(
> We are currently in the early stages of testing this architecture, and
> cannot accurately speak on the performance of one vs the other yet.
> I wonder if anyone else is using Isilon to run HDFS and can add some more
> details :)

That's an interesting article, though I came out confused. Where it talks about "HDFS protocol", what I think it means is that you can plug the EMC filestore into Hadoop as a new filesystem, with a new URI scheme (as there is with hdfs://, webhdfs://, s3n:// and others). I think so, though sometimes the drawings seem to blur things.

The Hadoop Filesystem API is sub-POSIX, so it is very easy to implement a bridge for. The basic file:// scheme works well with any distributed filesystem where you don't care about locality; presumably the SAN is there to handle that.

I'm not going to criticise any of the paper because I don't have any experience of Isilon and don't want to fault it.

What I will say is this: I fear SAN failures. When it is up, it is up. And when it is down you may as well go home for the day, as you won't see your files until the SAN vendor's support team comes round.

I do not have any data on how often SAN failures happen in the field. I will merely point people at MSR-TR-2004-67, *TerraServer SAN-Cluster Architecture and Operations Experience* [Gray 2004], which looks at the architecture, availability and failure modes of a multi-PB SAN at Microsoft (from a different vendor, eight years ago, ...etc).

See also: http://wiki.apache.org/hadoop/SPOF

-steve
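The idea of plugging a store in under a new URI scheme can be pictured with a toy model. This is only a sketch in Python, not Hadoop code: the registry, the `resolve_filesystem` helper and the `vendorfs` scheme are all hypothetical names made up for illustration, and the implementation names are used here purely as labels.

```python
from urllib.parse import urlparse

# Hypothetical registry mapping a URI scheme to an implementation label.
# This dict only models the idea of "one implementation per scheme".
FILESYSTEMS = {
    "hdfs": "DistributedFileSystem",
    "file": "LocalFileSystem",
    "s3n": "NativeS3FileSystem",
}


def resolve_filesystem(uri, registry=FILESYSTEMS):
    """Return the implementation label registered for the URI's scheme."""
    scheme = urlparse(uri).scheme
    if scheme not in registry:
        raise KeyError(f"no filesystem registered for scheme {scheme!r}")
    return registry[scheme]


# A vendor store would be added as one more entry (assumed names).
extended = dict(FILESYSTEMS, vendorfs="VendorFileSystem")
print(resolve_filesystem("hdfs://namenode/data/part-00000"))
print(resolve_filesystem("vendorfs://cluster/data/part-00000", extended))
```

Nothing in this model knows where the blocks live, which is the point about locality: a scheme lookup alone gives the job no placement information.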
Steve Loughran 2012-10-19, 08:06
-
RE: HDFS using SAN
Pamecha, Abhishek 2012-10-19, 00:29
Check this out:
http://www.symantec.com/connect/articles/getting-hang-iops-v13#a12

Maybe this helps. I think their RAID configuration or striping is contributing to it. Just my guess!

Thanks,
Abhishek

From: Jitendra Kumar Singh [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 18, 2012 6:49 AM
To: [EMAIL PROTECTED]
Subject: Re: HDFS using SAN

Hi,

In the NetApp whitepaper on the SAN solution (link given by Kevin) it makes the following statement. Can someone please elaborate (or give a link that explains) how 12 disks in a SAN can give 2000 IOPS while, if used as JBOD, they would give 600 IOPS?

"The E2660 can deliver up to 2,000 IOPS from a 12-disk stripe (the bottleneck being the 12 disks). This headroom translates into better read times for those 64KB blocks. Twelve copies of 12 MapReduce jobs reading from 12 SATA disks can at best never exceed 12 x 50 IOPS, or 600 IOPS. The E2660 volume has five times the IOPS headroom, which translates into faster read times and high MapReduce throughput "

Thanks and Regards,
--
Jitendra Kumar Singh

On Thu, Oct 18, 2012 at 6:02 PM, Luca Pireddu <[EMAIL PROTECTED]> wrote:
> On 10/18/2012 02:21 AM, Pamecha, Abhishek wrote:
>> Tom
>> Do you mean you are using GPFS instead of HDFS? Also, if you can share,
>> are you deploying it as DAS set up or a SAN?
>> Thanks,
>> Abhishek
>
> Though I don't think I'd buy a SAN for a new Hadoop cluster, we have a SAN
> and are using it *instead of HDFS* with a small/medium Hadoop MapReduce
> cluster (up to 100 nodes or so, depending on our need). We still use the
> local node disks for intermediate data (mapred local storage). Although
> this set-up does limit our possibility to scale to a large number of
> nodes, that's not a concern for us. On the plus, we gain the flexibility
> to be able to share our cluster with non-Hadoop users at our centre.
>
> --
> Luca Pireddu
> CRS4 - Distributed Computing Group
> Loc. Pixina Manna Edificio 1
> 09010 Pula (CA), Italy
> Tel: +39 0709250452
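The figures in the whitepaper excerpt quoted in this message can be checked with simple arithmetic. The sketch below uses only the numbers from the excerpt (12 disks, 50 IOPS per SATA disk, 2,000 IOPS for the stripe); the function names are made up for illustration, and it makes no claim about how the vendor measured its numbers.

```python
DISKS = 12
IOPS_PER_SATA_DISK = 50   # per-disk figure used in the excerpt
STRIPE_IOPS = 2000        # figure quoted for the 12-disk stripe


def jbod_iops(disks, iops_per_disk):
    """Aggregate IOPS if every disk is driven independently."""
    return disks * iops_per_disk


def implied_iops_per_disk(stripe_iops, disks):
    """Per-disk IOPS the stripe figure would imply."""
    return stripe_iops / disks


jbod = jbod_iops(DISKS, IOPS_PER_SATA_DISK)
ratio = STRIPE_IOPS / jbod
per_disk = implied_iops_per_disk(STRIPE_IOPS, DISKS)

print(f"JBOD aggregate: {jbod} IOPS")
print(f"Stripe / JBOD ratio: {ratio:.2f}x")
print(f"Implied per-disk rate in the stripe: {per_disk:.1f} IOPS")
```

On these numbers the stripe figure is about 3.3 times the JBOD figure, not five times, and it implies roughly 167 IOPS per disk. So the two numbers presumably assume different per-disk rates (for example, different access patterns), which would fit the guess that striping or RAID configuration is behind the gap.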
Pamecha, Abhishek 2012-10-19, 00:29
-
HDFS using SAN
Pamecha, Abhishek 2012-10-16, 18:28
Hi

I have read scattered documentation across the net, most of which says HDFS doesn't go well with a SAN being used to store data, while some say it is an emerging trend. I would love to know if any tests have been performed that hint at where direct storage excels over or falls behind a SAN.

We are investigating whether direct storage is a better option than SAN storage for a modest cluster with data on the order of 100 TB in steady state. The SAN can of course support an order of magnitude more IOPS than we care about for now, but given that it is shared infrastructure and we may expand our data size, that may not be an advantage in the future.

Another thing I am interested in: for MR jobs, where data locality is the key driver, how does that pan out when using a SAN instead of direct storage?

And of course, on the subjective topics of availability and reliability when using a SAN for data storage in HDFS, I would love to receive your views.

Thanks,
Abhishek
Pamecha, Abhishek 2012-10-16, 18:28
-
RE: HDFS using SAN
Jeffrey Buell 2012-10-16, 21:24
It will be difficult to make a SAN work well for Hadoop, but not impossible. I have done direct comparisons (but not published them yet). Direct local storage is likely to have much more capacity and more total bandwidth. But you can do pretty well with a SAN if you stuff it with the highest-capacity disks and provide an independent 8 Gb (FC) or 10 GbE connection for every host. Watch out for overall SAN bandwidth limits (which may well be much less than the sum of the capacity of the wires connected to it). There will definitely be a hard limit to how many hosts you connect to a single SAN. Scaling to larger clusters will require multiple SANs.
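The warning about overall SAN bandwidth can be made concrete with a back-of-the-envelope model. This is a rough sketch under simplifying assumptions: every host reads at once, the SAN shares its aggregate bandwidth evenly, and the 40 Gb/s aggregate figure is a hypothetical number chosen for illustration, not a measurement of any product.

```python
def per_host_bandwidth_gbps(hosts, link_gbps, san_aggregate_gbps):
    """Bandwidth each host can expect when all hosts read at once.

    Each host is limited by its own link and by an even share of the
    SAN's aggregate bandwidth, whichever is smaller.
    """
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    return min(link_gbps, san_aggregate_gbps / hosts)


LINK_GBPS = 10.0           # one 10 GbE connection per host
SAN_AGGREGATE_GBPS = 40.0  # hypothetical SAN-wide limit

for hosts in (2, 4, 8, 16):
    share = per_host_bandwidth_gbps(hosts, LINK_GBPS, SAN_AGGREGATE_GBPS)
    print(f"{hosts:2d} hosts -> {share:.2f} Gb/s per host")
```

With these assumed numbers, the per-host share starts to drop below the link speed once more than four hosts are attached, which is one way to see why a single SAN puts a ceiling on cluster size.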

Locality is an issue. Even though each host has direct physical access to all the data, a "remote" access in HDFS will still have to go over the network to the host that owns the data. "Local" access is fine with the constraints above.

RAID is not good for Hadoop performance for both local and SAN storage, so you'll want to configure one LUN for each physical disk in the SAN. If you do have mirroring or RAID on the SAN, you may be tempted to use that to replace Hadoop replication. But while the data is protected, access to the data is lost if the datanode goes down. You can get around that by running the datanode in a VM which is stored on the SAN and using VMware HA to automatically restart the VM on another host in case of a failure. Hortonworks has demonstrated this use-case but this strategy is a bit bleeding-edge.

Jeff
Jeffrey Buell 2012-10-16, 21:24
-
Re: HDFS using SAN
lohit 2012-10-16, 22:26
Adding to this. Locality is very important for MapReduce applications. One might not see much of a difference for small MapReduce jobs running on direct attached storage vs SAN, but when your cluster grows or you find jobs which are heavy on IO, you would see quite a bit of difference. One thing which is obvious is also the cost difference. The argument for that has been that SAN storage is much more reliable, so you do not need the default 3-way replication factor you would use on direct attached storage.

--
Have a Nice Day!
Lohit
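To put rough numbers on the replication argument, the sketch below compares the raw capacity needed for the same logical data set under two layouts. It is only an illustration: the 100 TB figure comes from the original question, while the 1.25 array overhead (roughly an 8+2 parity layout) is an assumed value, and prices are left out entirely.

```python
def raw_capacity_tb(logical_tb, replication, array_overhead=1.0):
    """Raw disk capacity needed to hold `logical_tb` of data.

    replication    -- HDFS replication factor (copies of every block)
    array_overhead -- multiplier for RAID/parity overhead inside the
                      storage array (1.0 means plain disks)
    """
    if replication < 1 or array_overhead < 1.0:
        raise ValueError("replication must be >= 1 and array_overhead >= 1.0")
    return logical_tb * replication * array_overhead


LOGICAL_TB = 100  # data size mentioned in the original question

das = raw_capacity_tb(LOGICAL_TB, replication=3)
san = raw_capacity_tb(LOGICAL_TB, replication=1, array_overhead=1.25)

print(f"Direct attached, 3 replicas: {das:.0f} TB raw")
print(f"SAN, 1 replica plus parity:  {san:.0f} TB raw")
```

Whether the smaller raw footprint actually makes the SAN cheaper depends on the cost per TB of each kind of storage, which this sketch does not model.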
lohit 2012-10-16, 22:26
-
RE: HDFS using SAN
Pamecha, Abhishek 2012-10-16, 23:28
Yes, for MR, my impression is that network utilization is typically next to none during map and reduce tasks but jumps during shuffle. With a SAN, I would assume there is no such separation: there will be network activity throughout the job's time window, with shuffle probably doing more than it should.

Moreover, I hear that SANs typically, by default, split data across different physical disks [even w/o RAID], so contiguity is lost. I have no idea whether that is a good thing or a bad one. It looks bad on the surface, but it probably depends on how efficiently a SAN can parallelize data fetches from multiple physical disks. Any comments on this aspect?

And yes, when the dataset volume increases and one basically needs to do full-table-scan equivalents, I am assuming the network needs to support moving that entire data set from the SAN to the data nodes, all in parallel, to different mappers.

So what I am gathering is that although storing data on a SAN is possible for a Hadoop installation, map-shuffle-reduce may not be the best way to process data in that environment. Is this conclusion correct?

<3 way Replication and RAID suggestions are great.

Thanks,
Abhishek
Pamecha, Abhishek 2012-10-16, 23:28
-
Re: HDFS using SAN
Kevin O'dell 2012-10-17, 13:25
You may want to take a look at the NetApp white paper on this. They have a SAN solution as their Hadoop offering.

http://www.netapp.com/templates/mediaView?m=tr-3969.pdf&cc=us&wid=130618138&mid=56872393

Kevin O'Dell
Customer Operations Engineer, Cloudera
Kevin O'dell 2012-10-17, 13:25
-
Re: HDFS using SAN
Mohamed Riadh Trad 2012-10-17, 13:37
Back up your data!
Mohamed Riadh Trad 2012-10-17, 13:37
-
RE: HDFS using SAN
Pamecha, Abhishek 2012-10-18, 00:26
In a SAN? Would it be a concern if I am relying on HDFS to do the replication and using the SAN only as a dumb storage tier? In that case, the only difference is remote vs local access.

Reliability may actually be even better in a SAN, because I would assume any reasonable SAN would provide decent fault tolerance when its controller(s) fail.

Thanks,
Abhishek
Pamecha, Abhishek 2012-10-18, 00:26
|