-
Send a map to all nodes
Luiz Carlos Muniz 2012-03-29, 00:25
Hi,
Is there any way to ensure that a map runs on every node of a cluster, such that each node runs the map once and only once? That is, I would like to use Hadoop to execute a method on all nodes in the cluster, with no possibility of the method executing twice on the same node, even if another node fails.

I have already set mapred.tasktracker.map.tasks.maximum to 1 and mapred.max.jobs.per.node to 1, but still, if a node fails, another node that has already run a map runs it again to make up for the failed node.

Luiz Carlos Melo Muniz
-
Re: Send a map to all nodes
Harsh J 2012-03-29, 04:46
Luiz,
Though it is possible to 'hint' this by tweaking the InputSplits passed from the job, the default schedulers of Hadoop do not make any such guarantees. Hence this isn't possible unless you write your own complete scheduler, an exercise that wouldn't suit production deployments unless you also test your scheduler intensively against other types of workloads.

Why do you even need such a thing? For processing purposes or otherwise? I'm hoping it's not a monitoring sort of hack you're trying to do.

-- Harsh J
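To illustrate why a location hint is not a guarantee, here is a minimal toy model in Python. It is not Hadoop's scheduler code; the function `schedule`, the task names, and the node names are all hypothetical. It assumes a scheduler that honors a task's preferred host when that host is alive and otherwise falls back to the least-loaded live node, which is roughly the behavior described in the original question.

```python
def schedule(tasks, live_nodes):
    """Toy locality-aware scheduler (an assumption, not Hadoop's real logic).

    tasks:      dict mapping task id -> preferred host (the split's "hint")
    live_nodes: list of hosts that are currently alive
    Returns a dict mapping task id -> host that actually ran the task.
    """
    if not live_nodes:
        raise ValueError("no live nodes to schedule on")

    # Number of tasks assigned to each live node so far.
    load = {node: 0 for node in live_nodes}
    assignment = {}

    for task_id in sorted(tasks):
        preferred = tasks[task_id]
        if preferred in load:
            # The hint can be honored because the preferred host is alive.
            host = preferred
        else:
            # The preferred host is down, so the task is moved elsewhere:
            # pick the least-loaded live node (ties broken by name).
            host = min(live_nodes, key=lambda n: (load[n], n))
        load[host] += 1
        assignment[task_id] = host

    return assignment


# One task hinted at each of three hypothetical nodes.
tasks = {"map_0": "node1", "map_1": "node2", "map_2": "node3"}

# All nodes alive: every hint is honored, one task per node.
healthy = schedule(tasks, ["node1", "node2", "node3"])

# node3 has failed: its task lands on a node that already ran a map.
degraded = schedule(tasks, ["node1", "node2"])
```

In this sketch the hint only expresses a preference. Once a node is gone, the model prefers finishing the task over keeping "once per node", so a surviving node ends up running two maps.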
-
Re: Send a map to all nodes
Luiz Carlos Muniz 2012-03-29, 12:58
Do not worry about this.
My problem is simply to run an algorithm on all nodes in a grid. I have realized that Hadoop does not serve this purpose, and I am already studying an alternative. If you have any suggestions, I will be grateful.

Luiz Carlos Melo Muniz
-
Re: Send a map to all nodes
Samir Eljazovic 2012-03-29, 21:07
Hi Luiz,
You should consider Storm <https://github.com/nathanmarz/storm> or S4 <http://incubator.apache.org/s4/> for your purpose. In Storm you can create a topology to run your algorithm on all nodes.

HTH
Samir
-
Re: Send a map to all nodes
Radim Kolar 2012-04-03, 11:06
YARN in Hadoop 0.23.1 can do this.