Re: want to try HBase on a large cluster running Lustre - any advice?
Jacques 2011-12-06, 19:04
A few quick thoughts:
- We run with DDR IB as our primary interconnect. We use local disks,
however. Things work well.
- If you're going to use both Ethernet and IPoIB for access: in the past
there were issues when using different network adapters with HBase. For us,
hostnames map to the IB IP addresses. Every Ethernet-only machine that
accesses these machines then has a /32 static route to each node's
InfiniBand IP via that node's paired Ethernet adapter.
- If you update the local filesystem to do sync, you'll also need to add
FSUtils support for the updated filesystem (see
https://issues.apache.org/jira/browse/HBASE-4169 for an example).
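
To make the static-route setup above a bit more concrete, here is a rough sketch that generates one /32 host route per node for an Ethernet-only client. The addresses and the names static_route_commands and NODE_PAIRS are made up for illustration; the sketch also assumes a Linux client with iproute2 and that each node's Ethernet address is directly reachable from the client. It only prints the commands and does not run them.

```python
import ipaddress


def static_route_commands(pairs):
    """Build one `ip route add` command per node.

    `pairs` maps each node's InfiniBand (IPoIB) address to the address of
    the Ethernet adapter on that same node.
    """
    commands = []
    for ib_ip, eth_ip in sorted(pairs.items()):
        # Validate both addresses; a malformed one raises ValueError.
        ib = ipaddress.ip_address(ib_ip)
        eth = ipaddress.ip_address(eth_ip)
        # /32 host route: reach the IB address via the node's Ethernet address.
        commands.append(f"ip route add {ib}/32 via {eth}")
    return commands


# Hypothetical addresses, for illustration only.
NODE_PAIRS = {
    "10.10.0.11": "192.168.1.11",
    "10.10.0.12": "192.168.1.12",
}

if __name__ == "__main__":
    for cmd in static_route_commands(NODE_PAIRS):
        print(cmd)
```

Since hostnames resolve to the IB addresses, routes like these are what let an Ethernet-only machine reach the same hostnames without a second set of names.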
On Mon, Dec 5, 2011 at 2:04 PM, Taylor, Ronald C <[EMAIL PROTECTED]> wrote:
> Hello Lars,
> Thanks for your previous help. Got a new question for you. I now have the
> opportunity to try using Hadoop and HBase on a newly installed cluster
> here, at a nominal cost. A lot of compute power (480+ nodes, 16 cores per
> node going up to 32 by the end of FY12, 64 GB RAM per node, with a few fat
> nodes with 256GB). One local drive of 1TB per node, and a four petabyte
> Lustre file system. Hadoop jobs are already running on this new cluster, on
> terabyte size data sets.
> Here's the drawback: I cannot permanently store HBase tables on local
> disk. After a job finishes, the disks are reclaimed. So - if I want to
> build a continuously available data warehouse (basically for analytics
> runs, not for real-time web access by a large community at present - just
> me and other internal bioinformatics folk here at PNNL) I need to put the
> HBase tables on the Lustre file system.
> Now, all the nodes in this cluster have a very fast InfiniBand QDR network
> interconnect. I think it's something like 40 gigabits/sec, as compared to
> the 1 gigabit/sec that you might see in a run-of-the-mill Hadoop cluster.
> And I just read a couple white papers that say that if the network
> interconnect is good enough, the loss of data locality when you use Lustre
> with Hadoop is not such a bad thing. That is, I Googled and found several
> papers on HDFS vs Lustre. The latest one I found (2011) is a white paper
> from a company called Xyratex. Here's a quote from it:
> The use of clustered file systems as a backend for Hadoop storage has been
> studied previously. The performance
> of distributed file systems such as Lustre, Ceph, PVFS, and GPFS
> with Hadoop has been compared to that
> of HDFS. Most of these investigations have shown that non-HDFS file
> systems perform more poorly than HDFS,
> although with various optimizations and tuning efforts, a clustered file
> system can reach parity with HDFS. However,
> a consistent limitation in the studies of HDFS and non-HDFS performance
> with Hadoop is that they used the network
> infrastructure to which Hadoop is limited, TCP/IP, typically over 1 GigE.
> In HPC environments, where much faster
> network interconnects are available, significantly better clustered file
> system performance with Hadoop is possible.
> Anyway, I am not principally worried about speed or efficiency right now -
> this cluster is big enough that even if I do not use it most efficiently,
> I'll still be doing better than with my very small current cluster, which
> has very limited RAM and antique processors.
> My question is: will HBase work at all on Lustre? That is, on pp. 52-54 of
> your O'Reilly HBase book, you say that
> "... you are not locked into HDFS because the "FileSystem" used by HBase
> has a pluggable architecture and can be used to replace HDFS with any other
> supported system. The possibilities are endless and waiting for the brave
> at heart." ... "You can select a different filesystem implementation by
> using a URI pattern, where the scheme (the part before the first ":",
> i.e., the colon) part of the URI identifies the driver to be used."
> We use HDFS by setting the URI to