Re: HDFS using SAN

Agreed, Luca. We do this to support existing customers who have requested
it, and it works fine within the obvious I/O considerations. But it is not
a recommended way to do a greenfield deployment.

Tom Deutsch
Program Director
Information Management
Big Data Technologies
3565 Harbor Blvd
Costa Mesa, CA 92626-1420

Twitter: @thomasdeutsch
Data Management Blog: ibmdatamag.com/author/tdeutsch/
LinkedIn: http://www.linkedin.com/profile/view?id=833160
Quora: http://www.quora.com/Tom-Deutsch
IBM Big Data Hub Blog: http://www.ibmbigdatahub.com/blog/author/tom-deutsch

From: Luca Pireddu <[EMAIL PROTECTED]>
Date: 10/18/2012 05:33 AM
Subject: Re: HDFS using SAN

On 10/18/2012 02:21 AM, Pamecha, Abhishek wrote:
> Tom
> Do you mean you are using GPFS instead of HDFS? Also, if you can share,
> are you deploying it as a DAS setup or on a SAN?
> Thanks,
> Abhishek

Though I don't think I'd buy a SAN for a new Hadoop cluster, we have a
SAN and are using it *instead of HDFS* with a small/medium Hadoop
MapReduce cluster (up to 100 nodes or so, depending on our needs). We
still use the local node disks for intermediate data (mapred local
storage). Although this setup does limit our ability to scale to a
large number of nodes, that's not a concern for us. On the plus side, we
gain the flexibility to share our cluster with non-Hadoop users at our
centre.

Luca Pireddu
CRS4 - Distributed Computing Group
Loc. Pixina Manna Edificio 1
09010 Pula (CA), Italy
Tel: +39 0709250452