Re: Hadoop-on-demand and torque
Ralph Castain 2012-05-21, 14:59
OMPI is a performance-focused community, so we always compare things :-)
We have some initial data against YARN, but not Mesos. Someone has been looking at porting OMPI to Mesos, but it turns out that Mesos isn't a particularly friendly MPI platform (a couple of us have been trying to provide advice on how to overcome the obstacles). I'm not sure what his plans are for completing that work - we haven't heard from him for a few weeks.
In terms of YARN, the OMPI-based "HOD" solution launches an MPI program about 1000x faster, and runs it about 10x faster. The launch-time difference grows with scale because the YARN MPI solution wires up in quadratic time, while the OMPI solution wires up in logarithmic time. The execution-time difference depends upon the application (IO bound vs compute bound), but largely stems from a difference in available data transports.
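To make the scaling argument concrete, here is a toy sketch of how quadratic and logarithmic wireup costs diverge as the job grows. It is not a model of how YARN or OMPI actually wire up: the two functions, their names, and the "steps" unit are all assumptions chosen only to show the growth rates.

```python
def pairwise_wireup_steps(n_procs: int) -> int:
    """Hypothetical all-to-all exchange: every process contacts every
    other process, so the step count grows quadratically."""
    return n_procs * (n_procs - 1)


def tree_wireup_steps(n_procs: int) -> int:
    """Hypothetical binary-tree relay: the number of rounds is the tree
    depth, ceil(log2(n_procs)), computed here with integer arithmetic."""
    return (n_procs - 1).bit_length()


if __name__ == "__main__":
    # Compare the two growth rates at a few illustrative job sizes.
    for n in (8, 64, 512, 4096):
        print(n, pairwise_wireup_steps(n), tree_wireup_steps(n))
```

Going from 8 to 4096 processes multiplies the pairwise count by several hundred thousand, while the tree depth only goes from 3 to 12, which is why the gap widens with scale.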
As a practical example, running a simple MPI "ring" program takes about 90 seconds on an 8 node system using YARN, and about 35 milliseconds using OMPI under SLURM. An MR word count program that looked at 1000 files took about 6 minutes using YARN, and about 11 seconds using OMPI's MR+.
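For anyone unfamiliar with the benchmark: the actual test program isn't shown in this thread, but a "ring" program conventionally passes a token from each rank to the next until it returns to rank 0. The sketch below simulates that pattern in a single process, without MPI; `run_ring` and its parameters are hypothetical names used only for illustration.

```python
def run_ring(n_ranks: int, laps: int = 1) -> int:
    """Simulate passing a token around a ring of ranks.

    Returns the number of point-to-point messages sent. This is a
    single-process stand-in for the MPI send/receive pairs a real
    ring program would use.
    """
    token_holder = 0
    messages = 0
    remaining_laps = laps
    while remaining_laps > 0:
        # The current holder sends the token to its right-hand neighbour.
        token_holder = (token_holder + 1) % n_ranks
        messages += 1
        # A lap is complete each time the token returns to rank 0.
        if token_holder == 0:
            remaining_laps -= 1
    return messages


if __name__ == "__main__":
    print(run_ring(8))
```

If the real program looks like this, the application itself does almost no work, so the measured time is mostly launch and wireup.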
Non-MPI programs also tend to launch faster due to the difference in how YARN handles launch vs other RMs. Again, a non-MPI "hello" running on an 8-node system can still take 20 seconds under YARN, depending on the heartbeat setting, versus about 25 milliseconds using SLURM. You don't get the wireup impact, of course, so the time difference remains fairly consistent with scale.
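For reference, the ratios implied by the timings quoted above can be worked out directly. The numbers are the ones given in this message; the dictionary layout and the `speedup` helper are just illustrative, and the 20-second "hello" figure is the worst case mentioned, since it depends on the heartbeat setting.

```python
# (YARN seconds, alternative seconds), taken from the figures in this message.
timings = {
    "MPI ring, 8 nodes (OMPI under SLURM)": (90.0, 0.035),
    "MR word count, 1000 files (OMPI MR+)": (6 * 60.0, 11.0),
    "non-MPI hello, 8 nodes (SLURM)": (20.0, 0.025),
}


def speedup(yarn_seconds: float, other_seconds: float) -> float:
    """Ratio of the YARN time to the alternative's time."""
    return yarn_seconds / other_seconds


if __name__ == "__main__":
    for name, (yarn_s, other_s) in timings.items():
        print(f"{name}: about {speedup(yarn_s, other_s):.0f}x")
```

These work out to roughly 2,600x for the ring, 33x for the word count, and 800x for the non-MPI hello.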
This is in line with what others have reported, so I think the results (although preliminary) are consistent with the findings reported elsewhere.
We'll have to wait to see about Mesos.
On May 21, 2012, at 8:45 AM, Charles Earl wrote:
> Do you have any YARN or Mesos performance comparison against HOD? I suppose since it was a customer requirement you might not have explored it. MPI support seems to be an active issue for Mesos now.
> On May 21, 2012, at 10:36 AM, Ralph Castain <[EMAIL PROTECTED]> wrote:
>> Not quite yet, though we are working on it (some descriptive stuff is around, but needs to be consolidated). Several of us started working together a couple of months ago to support the MapReduce programming model on HPC clusters using Open MPI as the platform. In working with our customers and OMPI's wide community of users, we found that people were interested in this capability, wanted to integrate MPI support into their MapReduce jobs, and didn't want to migrate their clusters to YARN for various reasons.
>> We have released initial versions of two new tools in the OMPI developer's trunk, scheduled for inclusion in the upcoming 1.7.0 release:
>> 1. "mr+" - executes the MapReduce programming paradigm. Currently, we only support streaming data, though we will extend that support shortly. All HPC environments (rsh, SLURM, Torque, Alps, LSF, Windows, etc.) are supported. Both mappers and reducers can utilize MPI (independently or in combination) if they so choose. Mappers and reducers can be written in any of the typical HPC languages (C, C++, and Fortran) as well as Java (note: OMPI now comes with Java MPI bindings).
>> 2. "hdfsalloc" - takes a list of files and obtains a resource allocation for the nodes upon which those files reside. SLURM and Moab/Maui are currently supported, with Gridengine coming soon.
>> There will be a public announcement of this in the near future, and we expect to integrate the Hadoop 1.0 and Hadoop 2.0 MR classes over the next couple of months. By the end of this summer, we should have a full-featured public release.
>> On May 20, 2012, at 2:10 PM, Brian Bockelman wrote:
>>> Hi Ralph,
>>> I admit - I've only been half-following the OpenMPI progress. Do you have a technical write-up of what has been done?
>>> On May 20, 2012, at 9:31 AM, Ralph Castain wrote:
>>>> FWIW: Open MPI now has an initial cut at "MR+" that runs map-reduce under any HPC environment. We don't have the Java integration yet to support the Hadoop MR class, but you can write a mapper/reducer and execute that programming paradigm. We plan to integrate the Hadoop MR class soon.