Ralph Castain 2011-11-21, 23:35
Hi folks
I am a lead developer in the Open MPI community, mostly focused on integrating that package with various environments. Over the last few months, I've had a couple of people ask me about MPI support within Hadoop - i.e., they want to run MPI applications under the Hadoop umbrella. I've spent a little time studying Hadoop, and it would seem a good fit for such a capability.
I'm willing to do the integration work, but wanted to check first whether (a) someone in the Hadoop community is already doing so, and (b) you would be interested in seeing such a capability and willing to accept the code contribution.
Establishing MPI support requires the following steps:
1. wireup support. MPI processes need to exchange endpoint info (e.g., for TCP connections, IP address and port) so that each process knows how to connect to any other process in the application. This is typically done in a collective "modex" operation. There are several ways of doing it - if we proceed, I will outline those in a separate email to solicit your input on the most desirable approach to use.
2. binding support. One can achieve significant performance improvements by binding processes to specific cores, sockets, and/or NUMA regions (regardless of using MPI or not, but certainly important for MPI applications). This requires not only the binding code, but some logic to ensure that one doesn't "overload" specific resources.
3. process mapping. I haven't verified it yet, but I suspect that Hadoop provides each executing instance with an identifier that is unique within that job - e.g., we typically assign an integer "rank" that ranges from 0 to one less than the number of instances being executed. This identifier is critical for MPI applications, and the relative placement of processes within a job often dictates overall performance. Thus, we would provide a mapping capability that allows users to specify patterns of process placement for their job - e.g., "place one process on each socket on every node".
I have written the code to implement the above support on a number of systems, and don't foresee major problems doing it for Hadoop (though I would welcome a brief walk-through of the code from someone). Please let me know if this would be of interest to the Hadoop community.
Thanks
Ralph Castain
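As a rough illustration of the mapping pattern mentioned in step 3 ("place one process on each socket on every node"), here is a minimal Python sketch. It is not Open MPI's or Hadoop's mapper: the function name `map_by_socket` and the hostname-to-socket-count inventory are hypothetical, chosen only to show how ranks 0..N-1 could be laid out without putting two processes on one socket.

```python
from typing import Dict, List, Tuple


def map_by_socket(nodes: Dict[str, int], nprocs: int) -> List[Tuple[int, str, int]]:
    """Assign ranks 0..nprocs-1, one process per socket, node by node.

    `nodes` maps a hostname to its socket count (a hypothetical inventory
    format, not something Hadoop or Open MPI provides in this shape).
    Raises ValueError instead of placing two processes on the same socket.
    """
    # Enumerate every (host, socket) slot in a deterministic order.
    slots = [(host, sock) for host in sorted(nodes) for sock in range(nodes[host])]
    if nprocs > len(slots):
        raise ValueError(
            "requested %d processes but only %d sockets are available"
            % (nprocs, len(slots))
        )
    # Rank i takes the i-th slot, so neighbouring ranks share a node when possible.
    return [(rank, host, sock) for rank, (host, sock) in enumerate(slots[:nprocs])]


if __name__ == "__main__":
    for rank, host, sock in map_by_socket({"node0": 2, "node1": 2}, 3):
        print("rank %d -> %s socket %d" % (rank, host, sock))
```

Filling one node before moving to the next is only one possible policy; a real mapper would let the user choose the pattern, as the message describes.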
Arun C Murthy 2011-11-21, 23:43
Hi Ralph,
Welcome! We'd absolutely love to have OpenMPI integrated with Hadoop!
In fact, there have already been a number of discussions about running OpenMPI on what we call MR2 (aka YARN), documented here: https://issues.apache.org/jira/browse/MAPREDUCE-2911.
YARN is our effort to re-imagine Hadoop MapReduce as a general-purpose, distributed data processing system that supports MapReduce, MPI, and other programming paradigms on the same Hadoop cluster.
Would love to collaborate. Should we discuss on that jira?
thanks,
Arun
Ralph Castain 2011-11-21, 23:47
On Nov 21, 2011, at 4:43 PM, Arun C Murthy wrote:
> Would love to collaborate, should we discuss on that jira?
Sure! I'll poke my nose over there...thanks!
Milind.Bhandarkar@... 2011-11-21, 23:47
Hi Ralph,
I spoke with Jeff Squyres at SC11, and updated him on the status of my OpenMPI port on Hadoop YARN.
To update everyone, I have OpenMPI examples running on YARN. The code still needs some cleanup and refactoring, but that can be done as a later step.
Currently, the MPI processes come up, get the submitting client's IP and port via environment variables, connect to it, and do a barrier. The result of this barrier is that everyone in MPI_COMM_WORLD gets each other's endpoints.
I am aiming to submit the patch to Hadoop by the end of this month.
I will publish the OpenMPI patch to GitHub.
(As I mentioned to Jeff, OpenMPI requires a CCLA for accepting submissions. That will take some time.)
- Milind
---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)
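The wireup described in this message (each process publishes its endpoint to a common rendezvous point, and after a barrier every process holds the full endpoint table) can be sketched as follows. This is an in-memory simulation in Python using threads, not the actual patch: the `Rendezvous` class, `run_wireup`, and the made-up addresses are all hypothetical, and a real implementation would read the client's IP and port from environment variables and talk over TCP.

```python
import threading
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class Rendezvous:
    """In-memory stand-in for the submitting client that collects endpoints."""

    def __init__(self, nprocs: int) -> None:
        self._table: Dict[int, Endpoint] = {}
        self._lock = threading.Lock()
        self._barrier = threading.Barrier(nprocs)

    def exchange(self, rank: int, endpoint: Endpoint) -> Dict[int, Endpoint]:
        """Publish this rank's endpoint, then wait until every rank has done so."""
        with self._lock:
            self._table[rank] = endpoint
        # The barrier releases only after all nprocs ranks have published,
        # so the table is complete when any caller gets past this line.
        self._barrier.wait()
        return dict(self._table)


def run_wireup(nprocs: int) -> Dict[int, Dict[int, Endpoint]]:
    """Simulate nprocs processes with threads; return each rank's view of the table."""
    rendezvous = Rendezvous(nprocs)
    views: Dict[int, Dict[int, Endpoint]] = {}

    def worker(rank: int) -> None:
        # Made-up endpoint values, for illustration only.
        views[rank] = rendezvous.exchange(rank, ("10.0.0.%d" % rank, 5000 + rank))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return views


if __name__ == "__main__":
    print(run_wireup(4)[0])
```

Routing everything through one collector is simple but puts all traffic on a single point; it is only one of the several possible ways of doing the exchange that the original proposal alludes to.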
Mahadev Konar 2011-11-21, 23:53
Milind,
Great news. Any chance you can upload the patch as it is? I am sure others can help clean it up. I am willing to help smooth it out, and I am sure Ralph can provide feedback as well.
thanks
mahadev
Ralph Castain 2011-11-21, 23:54
Hi Milind
Glad to hear of the progress - I recall our earlier conversation. I gather you have completed step 1 (wireup) - have you given any thought to the other two steps? Anything I can do to help?
Ralph
Milind.Bhandarkar@... 2011-11-22, 00:04
Ralph,
Yes, I have completed the first step, although I would really like that code to be part of the MPI Application Master (Chris Douglas suggested a way to do this at ApacheCon).
Regarding the remaining steps, I have been following discussions on the Open MPI mailing lists and reading the code for hwloc.
If you are making a trip to Cisco HQ sometime soon, I would like to have a face-to-face about hwloc. I have so far avoided using a native task controller for spawning MPI jobs, but given the lack of support for binding in Java, it looks like I will have to bite the bullet.
- milind
Ralph Castain 2011-11-22, 00:35
On Nov 21, 2011, at 5:04 PM, <[EMAIL PROTECTED]> wrote:
> If you are making a trip to Cisco HQ sometime soon, I would like to have a face-to-face about hwloc.
Not sure that looks likely right now - my project at Cisco is done, and it appears I'll be leaving the company soon.
> I have so far avoided using a native task controller for spawning MPI jobs, but given the lack of support for binding in Java, it looks like I will have to bite the bullet.
I was actually looking at porting the binding support to Java as it looks feasible to do so, and I can understand not wanting to absorb all that configuration code to handle it in C. Given the loss of job, I have some free time on my hands while I search for employment, so I thought I might spend it looking at the Hadoop integration - since you have completed the wireup, I might look at this next.
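The binding step has two parts: deciding which cores each process gets without overloading any of them, and then applying that decision to the process. A minimal Python sketch of both follows, assuming a flat list of core IDs per node. `CoreAllocator` and `bind_current_process` are hypothetical names, and this is not the hwloc-based logic Open MPI uses; `os.sched_setaffinity` is a real standard-library call, but it is only available on some platforms (notably Linux), so the sketch checks for it first.

```python
import os
from typing import Dict, List, Set


class CoreAllocator:
    """Track which cores are taken so no core is handed to two processes."""

    def __init__(self, cores: List[int]) -> None:
        self._free: List[int] = sorted(cores)
        self._owner: Dict[int, Set[int]] = {}

    def reserve(self, rank: int, ncores: int = 1) -> Set[int]:
        """Reserve `ncores` free cores for `rank`, or refuse if that would overload."""
        if ncores > len(self._free):
            raise RuntimeError("binding rank %d would overload the node" % rank)
        picked = set(self._free[:ncores])
        self._free = self._free[ncores:]
        self._owner[rank] = picked
        return picked


def bind_current_process(cores: Set[int]) -> bool:
    """Bind the calling process to `cores` where the OS supports it.

    Returns False on platforms without os.sched_setaffinity.
    """
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cores)  # pid 0 means the calling process
        return True
    return False


if __name__ == "__main__":
    allocator = CoreAllocator([0, 1, 2, 3])
    print(allocator.reserve(rank=0, ncores=2))
    print(allocator.reserve(rank=1, ncores=2))
```

The allocator is deliberately unaware of sockets and NUMA regions; handling those is where a topology library such as hwloc would come in.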
Ralph Castain 2011-11-24, 01:13
FWIW: I can commit the OMPI part of your patch for you. The CCLA is intended to ensure that people realize the need to protect OMPI from "infection" due to code based on other licenses such as GPL. For people only offering a single patch, it often is too big a burden to get corporate approval of the legal document.
So as long as someone (e.g., me) who already is operating under the CCLA is willing to review and commit the patch, and the patch isn't too huge, we can absorb it that way. I expect your patch is just a new ess component, and I'm happy to do the review and commit it on your behalf, if that is acceptable to you.
Arun Murthy 2011-11-24, 01:31
Awesome, thanks to both you guys! It's very exciting to see this progress!
Arun
Sent from my iPhone
On Nov 23, 2011, at 5:14 PM, Ralph Castain <[EMAIL PROTECTED]> wrote:
> FWIW: I can commit the OMPI part of your patch for you. The CCLA is intended to ensure that people realize the need to protect OMPI from "infection" due to code based on other licenses such as GPL. For people only offering a single patch, it often is too big a burden to get corporate approval of the legal document.
>
> So as long as someone (e.g., me) who already is operating under the CCLA is willing to review and commit the patch, and the patch isn't too huge, we can absorb it that way. I expect your patch is just a new ess component, and I'm happy to do the review and commit it on your behalf, if that is acceptable to you.
>
> On Nov 21, 2011, at 5:04 PM, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:
>
>> Ralph,
>>
>> Yes, I have completed the first step, although I would really like that code to be part of the MPI Application Master (Chris Douglas suggested a way to do this at ApacheCon).
>>
>> Regarding the remaining steps, I have been following discussions on the open mpi mailing lists, and reading code for hwloc.
>>
>> If you are making a trip to Cisco HQ sometime soon, I would like to have a face-to-face about hwloc. I have so far avoided to use a native task controller for spawning MPI jobs, but given the lack of support for binding in Java, it looks like I will have to bite the bullet.
>>
>> - milind
>>
>> ---
>> Milind Bhandarkar
>> Greenplum Labs, EMC
>> (Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)
>>
>> On 11/21/11 3:54 PM, "Ralph Castain" <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Milind
>>>
>>> Glad to hear of the progress - I recall our earlier conversation. I gather you have completed step 1 (wireup) - have you given any thought to the other two steps? Anything I can do to help?
>>>
>>> Ralph
>>>
>>> On Nov 21, 2011, at 4:47 PM, <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Ralph,
>>>>
>>>> I spoke with Jeff Squyres at SC11, and updated him on the status of my OpenMPI port on Hadoop Yarn.
>>>>
>>>> To update everyone, I have OpenMPI examples running on #Yarn, although it requires some code cleanup and refactoring, however that can be done as a later step.
>>>>
>>>> Currently, the MPI processes come up, get submitting client's IP and port via environment variables, connect to it, and do a barrier. The result of this barrier is that everyone in MPI_COMM_WORLD gets each other's endpoints.
>>>>
>>>> I am aiming to submit the patch to hadoop by the end of this month.
>>>>
>>>> I will publish the openmpi patch to github.
>>>>
>>>> (As I mentioned to Jeff, OpenMPI requires a CCLA for accepting submissions. That will take some time.)
>>>>
>>>> - Milind
>>>>
>>>> ---
>>>> Milind Bhandarkar
>>>> Greenplum Labs, EMC
>>>> (Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)
>>>>
>>>>>
>>>>> I'm willing to do the integration work, but wanted to check first to see if (a) someone in the Hadoop community is already doing so, and (b) if you would be interested in seeing such a capability and willing to accept the code contribution?
>>>>>
>>>>> Establishing MPI support requires the following steps:
>>>>>
>>>>> 1. wireup support. MPI processes need to exchange endpoint info (e.g., for TCP connections, IP address and port) so that each process knows how to connect to any other process in the application. This is typically done in a collective "modex" operation. There are several ways of doing
Milind.Bhandarkar@... 2011-11-28, 17:57
Great! Works for me! Thanks, Ralph.
- Milind
---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)