Praveen Sripati
2011-06-14, 13:01
Arun C Murthy
2011-06-14, 19:29
Praveen Sripati
2011-06-15, 00:35
Mahadev Konar
2011-06-15, 03:29
Praveen Sripati
2011-06-15, 03:38
Mahadev Konar
2011-06-15, 03:47
Praveen Sripati
2011-06-15, 12:27
Josh Wills
2011-06-15, 16:21
Jeffrey Naisbitt
2011-06-16, 16:02
Luke Lu
2011-06-16, 18:53
-
Queries on MRv2
Praveen Sripati 2011-06-14, 13:01

Hi,

I have gone through the MapReduce NextGen blog entries and JIRA and have the following queries.

>> There is a single API between the Scheduler and the ApplicationMaster:
>> (List <Container> newContainers, List <ContainerStatus> containerStatuses) allocate (List <ResourceRequest> ask, List<Container> release)
>> The AM ask for specific resources via a list of ResourceRequests (ask) and releases unnecessary Containers which were allocated by the Scheduler.
>> The response contains a list of newly allocated Containers and the statuses of application-specific Containers that completed since the previous interaction between the AM and the RM.

Q) If split-0 is available on host1, host2 and host3, can the ApplicationMaster ask the scheduler for a container on host1 or host2 or host3? This way the scheduler can allocate the resources more effectively.

Q) In a cluster there might be nodes of different capacities. How will the scheduler know that a particular node has 4 GB and another has 16 GB RAM before allocating the resources to the ApplicationMaster?

Q) Are the unnecessary containers (List<Container> release) released by the ApplicationMaster in the request the ones rejected by the ApplicationMaster, or those on which the map/reduce tasks have completed?

Q) What does the following in the response contain: "List <ContainerStatus> containerStatuses"?

Q) Once the ApplicationMaster gets the list of new containers from the Scheduler, what is the interaction between the ApplicationMaster and the Node Manager? Will the ApplicationMaster ask the Node Manager on the different nodes to launch/monitor the map/reduce tasks in those containers?

Q) Does the Scheduler ask the Node Manager to create the containers on the different nodes?

>> The resource requests are also aggregated by racks and then by the special any (*) for all containers. All resource requests are subject to change via the delta protocol.

Q) Does (*) mean that the ApplicationMaster is OK with a container in any rack/host? This might be applicable for Reduce tasks.

Thanks,
Praveen
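The call shape quoted from the blog can be sketched in Java roughly as follows. This is only an illustration of the described signature: every type here (ResourceRequest, Container, ContainerStatus, AllocateResponse, Scheduler, GrantEverything) is a hypothetical stand-in written for this sketch, not the actual MRv2 classes, and memory in MB is an assumed resource dimension.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types that mirror the quoted signature; NOT the real MRv2 classes.
final class AllocateSketch {

    static final class ResourceRequest {
        final String location;    // host name, rack name, or "*" for anywhere
        final int memoryMb;       // assumed resource dimension
        final int numContainers;  // how many containers are wanted at this location

        ResourceRequest(String location, int memoryMb, int numContainers) {
            this.location = location;
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    static final class Container {
        final int id;
        final String location;
        final int memoryMb;

        Container(int id, String location, int memoryMb) {
            this.id = id;
            this.location = location;
            this.memoryMb = memoryMb;
        }
    }

    static final class ContainerStatus {
        final int containerId;
        final int exitCode;

        ContainerStatus(int containerId, int exitCode) {
            this.containerId = containerId;
            this.exitCode = exitCode;
        }
    }

    // Java has no tuple return, so the pair in the quoted signature becomes one object.
    static final class AllocateResponse {
        final List<Container> newContainers = new ArrayList<>();
        final List<ContainerStatus> containerStatuses = new ArrayList<>();
    }

    interface Scheduler {
        AllocateResponse allocate(List<ResourceRequest> ask, List<Container> release);
    }

    // Toy scheduler for illustration: grants every ask at once, ignores releases,
    // and never reports completed containers.
    static final class GrantEverything implements Scheduler {
        private int nextId = 1;

        public AllocateResponse allocate(List<ResourceRequest> ask, List<Container> release) {
            AllocateResponse response = new AllocateResponse();
            for (ResourceRequest request : ask) {
                for (int i = 0; i < request.numContainers; i++) {
                    response.newContainers.add(
                            new Container(nextId++, request.location, request.memoryMb));
                }
            }
            return response;
        }
    }
}
```

A real scheduler would of course grant containers over several calls as capacity frees up; the toy implementation only exists to make the request and response shapes concrete.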
-
Re: Queries on MRv2
Arun C Murthy 2011-06-14, 19:29

On Jun 14, 2011, at 6:31 PM, Praveen Sripati wrote:

> Q) If split-0 is available on host1, host2 and host3, can the ApplicationMaster ask the scheduler for a container on host1 or host2 or host3? This way the scheduler can allocate the resources more effectively.

Yes, absolutely.

> Q) In a cluster there might be nodes of different capacities. How will the scheduler know that a particular node has 4 GB and another has 16 GB RAM before allocating the resources to the ApplicationMaster?

The NodeManager informs the RM about its capabilities on registration. The RM allocates appropriate resources to the AM(s).

> Q) Are the unnecessary containers (List<Container> release) released by the ApplicationMaster in the request the ones rejected by the ApplicationMaster, or those on which the map/reduce tasks have completed?

Only unused ones.

> Q) What does the following in the response contain: "List <ContainerStatus> containerStatuses"?

Status for completed containers.

> Q) Once the ApplicationMaster gets the list of new containers from the Scheduler, what is the interaction between the ApplicationMaster and the Node Manager? Will the ApplicationMaster ask the Node Manager on the different nodes to launch/monitor the map/reduce tasks in those containers?

No, the AM directly monitors the containers via an application-specific protocol. For MR applications we use TaskUmbilicalProtocol. The NM just monitors the unix process and informs the RM on exit of the unix process.

> Q) Does the Scheduler ask the Node Manager to create the containers on the different nodes?

No, the Scheduler allocates them to the respective AMs, who then launch the container by talking to the NM. The NM can securely verify the authenticity of the 'container launch' request, including the resources allocated to the container.

> >> The resource requests are also aggregated by racks and then by the special any (*) for all containers. All resource requests are subject to change via the delta protocol.
>
> Q) Does (*) mean that the ApplicationMaster is OK with a container in any rack/host? This might be applicable for Reduce tasks.

Yes.

Hope this helps.

Arun
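One way to picture the host / rack / any (*) aggregation discussed above is the sketch below. The thread only states that requests are aggregated by rack and then by "*"; the exact counting rule used here (one count per host holding the split, one per distinct rack among those hosts, one under "*" per task) is an assumption made for illustration, and LocalityAskSketch, buildAsk and the rack names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the counting rule is an assumption, not taken from the MRv2 code.
final class LocalityAskSketch {
    static final String ANY = "*";

    // splitHosts: one entry per task, listing the hosts that hold that task's split.
    // rackOf: host -> rack. Host and rack names are assumed not to collide.
    static Map<String, Integer> buildAsk(List<List<String>> splitHosts,
                                         Map<String, String> rackOf) {
        Map<String, Integer> hostCounts = new LinkedHashMap<>();
        Map<String, Integer> rackCounts = new LinkedHashMap<>();
        int totalTasks = 0;

        for (List<String> hosts : splitHosts) {
            Set<String> racksForTask = new LinkedHashSet<>();
            for (String host : hosts) {
                increment(hostCounts, host);
                String rack = rackOf.get(host);
                if (rack != null) {
                    racksForTask.add(rack);
                }
            }
            // A task counts once per rack, however many replicas sit in that rack.
            for (String rack : racksForTask) {
                increment(rackCounts, rack);
            }
            totalTasks++;
        }

        Map<String, Integer> ask = new LinkedHashMap<>(hostCounts);
        ask.putAll(rackCounts);
        ask.put(ANY, totalTasks);  // "*" carries the total number of containers wanted
        return ask;
    }

    private static void increment(Map<String, Integer> counts, String key) {
        Integer current = counts.get(key);
        counts.put(key, current == null ? 1 : current + 1);
    }
}
```

Under this rule, a single split replicated on host1, host2 and host3 produces a count of 1 for each host, 1 for each rack involved, and 1 for "*", which lets a scheduler satisfy the one container at whichever level of locality it can.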
-
Re: Queries on MRv2
Praveen Sripati 2011-06-15, 00:35

Arun,

Thanks for the reply.

Q) What happens if an ApplicationMaster asks a NM to launch a container and then releases the container in the allocate call later?

Q) So, the NM watches the UNIX Process/Containers and sends the status to the ApplicationManager. Later the ApplicationManager sends the status of the containers in response to the allocate call to the ApplicationMaster. Why should the ApplicationMaster be aware of the container status, since it's already tracking the map/reduce tasks in the containers?

Q) Does the ApplicationMaster notify the NodeManager to exit the UNIX Process when the map/reduce tasks in that particular container are completed? Are the containers re-used?

Q) The ApplicationManager asks the NodeManager to create a container and also launch the map/reduce task in it. From then on the ApplicationManager and Map/Reduce tasks interact directly without the NodeManager. Am I correct?

Praveen
-
Re: Queries on MRv2
Mahadev Konar 2011-06-15, 03:29

Praveen,
Answers in line:

> Q) What happens if an ApplicationMaster asks a NM to launch a container and then releases the container in the allocate call later?

The ApplicationMaster only releases the container once the container is done.

> Q) So, the NM watches the UNIX Process/Containers and sends the status to the ApplicationManager. Later the ApplicationManager sends the status of the containers in response to the allocate call to the ApplicationMaster. Why should the ApplicationMaster be aware of the container status, since it's already tracking the map/reduce tasks in the containers?

It's just a way to notify the ApplicationMaster as soon as possible when containers fail. This speeds up the notification of failed containers; otherwise the AM has to wait to discover failures via timeouts.

> Q) Does the ApplicationMaster notify the NodeManager to exit the UNIX Process when the map/reduce tasks in that particular container are completed? Are the containers re-used?

Yes, it notifies the NM.

Containers are not reused as of now. In the future we do see containers being reused, but we'll need leases to do that.

> Q) The ApplicationManager asks the NodeManager to create a container and also launch the map/reduce task in it. From then on the ApplicationManager and Map/Reduce tasks interact directly without the NodeManager. Am I correct?

I think you mean ApplicationMaster. Yes, the ApplicationMaster and the map/reduce tasks talk directly without the NM being involved.

thanks
mahadev
@mahadevkonar
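The point about container statuses acting as a fast failure signal can be sketched as below. This is a minimal illustration under stated assumptions: StatusHandlingSketch and its methods are hypothetical names, a non-zero exit code is assumed to mean failure, and tasks are assumed to report their own success to the AM over the application-specific protocol.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical AM-side bookkeeping; not the MapReduce ApplicationMaster's actual code.
final class StatusHandlingSketch {

    static final class ContainerStatus {
        final int containerId;
        final int exitCode;  // assumption: non-zero means the container failed

        ContainerStatus(int containerId, int exitCode) {
            this.containerId = containerId;
            this.exitCode = exitCode;
        }
    }

    // containerId -> task name, for tasks the AM still believes are running
    private final Map<Integer, String> running = new HashMap<>();
    private final List<String> toRetry = new ArrayList<>();

    void taskLaunched(int containerId, String task) {
        running.put(containerId, task);
    }

    // Assumption: a healthy task reports completion itself over the AM's own protocol.
    void taskReportedDone(int containerId) {
        running.remove(containerId);
    }

    // Called with the containerStatuses part of each allocate response.
    void onAllocateResponse(List<ContainerStatus> statuses) {
        for (ContainerStatus status : statuses) {
            String task = running.remove(status.containerId);
            // The container exited while the AM still thought the task was running:
            // schedule a retry now instead of waiting for a task timeout.
            if (task != null && status.exitCode != 0) {
                toRetry.add(task);
            }
        }
    }

    List<String> tasksToRetry() {
        return toRetry;
    }
}
```

Without the status list, the only way for this AM to notice a crashed task would be the absence of progress reports, which is the timeout path the reply describes.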
-
Re: Queries on MRv2
Praveen Sripati 2011-06-15, 03:38

Mahadev,

The MapReduce ApplicationMaster might behave well, but what about custom ApplicationMasters for other models?

> Q) What happens if an ApplicationMaster asks a NM to launch a container and then releases the container in the allocate call later?
>
> A) The Application Master only releases the container once the container is done.

Thanks,
Praveen
-
Re: Queries on MRv2
Mahadev Konar 2011-06-15, 03:47

Praveen,

In that case, if a just-launched container is released, the NM will be notified via the RM that the container is no longer valid, and the NM will go ahead and kill the container.

thanks
mahadev
@mahadevkonar
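The release path described above (AM releases, RM tells the NM, NM kills) might look like the following sketch. The thread does not say how the RM notifies the NM; delivering the list on the NM's next heartbeat is an assumption here, and ReleaseFlowSketch, ResourceManager, NodeManager and their methods are hypothetical names for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the release flow; not the actual RM/NM implementation.
final class ReleaseFlowSketch {

    static final class ResourceManager {
        private final Map<Integer, String> containerHost = new HashMap<>();
        private final Map<String, List<Integer>> pendingKills = new HashMap<>();

        void recordAllocation(int containerId, String host) {
            containerHost.put(containerId, host);
        }

        // Called when an AM lists the container under "release" in an allocate call.
        void release(int containerId) {
            String host = containerHost.remove(containerId);
            if (host == null) {
                return;  // unknown or already released
            }
            List<Integer> kills = pendingKills.get(host);
            if (kills == null) {
                kills = new ArrayList<>();
                pendingKills.put(host, kills);
            }
            kills.add(containerId);
        }

        // Assumption: the NM learns about invalid containers on its next heartbeat.
        List<Integer> heartbeat(String host) {
            List<Integer> kills = pendingKills.remove(host);
            return kills == null ? new ArrayList<Integer>() : kills;
        }
    }

    static final class NodeManager {
        final String host;
        final Set<Integer> runningContainers = new HashSet<>();

        NodeManager(String host) {
            this.host = host;
        }

        void launch(int containerId) {
            runningContainers.add(containerId);
        }

        // Kill (here: forget) every container the RM says is no longer valid.
        void sync(ResourceManager rm) {
            for (Integer containerId : rm.heartbeat(host)) {
                runningContainers.remove(containerId);
            }
        }
    }
}
```

The useful property for custom ApplicationMasters is that the RM's view wins: once a container is released, the node stops running it regardless of what the AM intended.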
-
Re: Queries on MRv2
Praveen Sripati 2011-06-15, 12:27

Hi,

- How can one specify dynamically that an ApplicationMaster should use a particular version of the MapReduce library?

- How does the ApplicationManager pick a node to run the ApplicationMaster? What resource considerations, if any, are taken into account while picking a particular node to run the ApplicationMaster?

- Who observes the ResourceManager/ApplicationMaster/NodeManager for failures so that they can be restarted later? From the blog entry it seems that the state of the ResourceManager is stored in ZooKeeper and the state of the ApplicationManager is stored in HDFS.

- It looks like the containers are based on Linux cgroups. So, is MRv2 limited to Linux boxes only?

Hope the design document from Arun will make me ask fewer queries in this forum :)

Thanks,
Praveen
-
Re: Queries on MRv2
Josh Wills 2011-06-15, 16:21

Hey Praveen,

I'm in the same boat as you re: getting started with the MR2 code. I have a couple of answers and a couple of followup questions for Arun et al. to keep in mind as they're writing a design doc.

On Wed, Jun 15, 2011 at 5:27 AM, Praveen Sripati <[EMAIL PROTECTED]> wrote:

> - How to specify that an ApplicationMaster use a particular version of the MapReduce library dynamically?

I don't totally grok the question-- doesn't the client-side code that configures the ApplicationMaster decide this?

> - How does the ApplicationManager pick a node to run the ApplicationMaster? What resource considerations are taken if any while picking a particular node to run the ApplicationMaster?

There is an ApplicationsManager (note the extra 's') that is part of the functionality of the RM. See reference: http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/

> - Who observes the ResourceManager/ApplicationMaster/NodeManager for failures to be restarted later? From the blog entry it seems that the state of the ResourceManager is stored in the ZooKeeper and the state of the ApplicationManager is stored in the HDFS.

So this is the classic problem of any such system-- who watches the watchmen? It seems like the client would be notified by the ApplicationSManager when an ApplicationManager failed (see the above blog post again; it's actually a good blog post, and it would be great to have a few more of them), the ResourceManager would know when a NodeManager failed, and it falls to an admin and/or an external monitoring system to detect ResourceManager failure and handle the restart.

> - Looks like the containers are based on Linux cgroups. So, is the MRv2 limited only to the Linux boxes?

Yeah, I bumped into this when I was doing a naive build + install on my Mac. Not that I see folks running a lot of hadoop clusters on Macs, but it would be cool if the basic build/install just worked on every platform, even if it's just as simple as detecting the platform and skipping the build of the native container-executor stuff. (Note: I actually got the container-executor stuff to build by using the standard Mac tricks, but I'm not sure if it's worth checking in.)
-
Re: Queries on MRv2
Jeffrey Naisbitt 2011-06-16, 16:02

On 6/15/11 11:21 AM, "Josh Wills" <[EMAIL PROTECTED]> wrote:

> Yeah, I bumped into this when I was doing a naive build + install on my Mac. Not that I see folks running a lot of hadoop clusters on Macs, but it would be cool if the basic build/install just worked on every platform, even if it's just as simple as detecting the platform and skipping the build of the native container-executor stuff. (Note: I actually got the container-executor stuff to build by using the standard Mac tricks, but I'm not sure if it's worth checking in.)

I would definitely be interested in things working on Macs. You don't have a guide for what you did, do you?

-Jeff
-
Re: Queries on MRv2
Luke Lu 2011-06-16, 18:53

You can use mvn clean install -P-cbuild on a Mac, and everything should pass for recent checkouts (since Tuesday).