-
freeze a mapreduce job
Rita 2012-05-11, 10:43

I have a rather large MapReduce job that takes a few days to run. I was wondering whether it's possible to freeze the job or make it less intensive. Is it possible to reduce the number of slots per host and then increase them overnight?

tia

--
--- Get your facts first, then you can distort them as you please.--
-
Re: freeze a mapreduce job
Harsh J 2012-05-11, 11:08

I do not know about per-host slot control (that is most likely not supported, or not yet anyway, and perhaps feels wrong to do), but the rest of what you need should be doable if you use schedulers and queues/pools.

If you use FairScheduler (FS), ensure that this job always goes to a special pool, and when you want to freeze the pool, simply set the pool's maxMaps and maxReduces to 0. Likewise, to constrict instead of freeze, set the maximum number of simultaneous tasks to whatever you wish. When you change the FairScheduler configs, you do not need to restart the JT; you can simply wait a few seconds for FairScheduler to refresh its own configs.

More on FS at http://hadoop.apache.org/common/docs/current/fair_scheduler.html

If you use CapacityScheduler (CS), then I believe you can do this by again making sure the job goes to a specific queue, and when you need to freeze it, simply set the queue's maximum-capacity to 0 (percent). To constrict it, choose a lower, positive percentage as you need. You can also make CS pick up config changes by refreshing queues via mradmin.

More on CS at http://hadoop.apache.org/common/docs/current/capacity_scheduler.html

Neither approach will freeze/constrict the job immediately, but both should certainly prevent it from progressing further. That is, tasks that are already running when the scheduler config changes will continue to run to completion, but any further task scheduling for those jobs will be subject to the changes.

P.S. A better solution would be to make your job not take as many days, somehow? :-)

--
Harsh J
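
For illustration only, here is a minimal Python sketch of the FairScheduler idea described above: it generates an allocation file that caps one pool's maxMaps and maxReduces, with a tighter cap during the day and a looser one overnight. The pool name `longjob`, the limits, and the day/night hours are hypothetical. The `allocations`/`pool` layout is assumed to match the Hadoop 1.x FairScheduler allocation file format, so check it against the FS documentation for your version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_allocations(pool_name, max_maps, max_reduces):
    """Return allocation-file XML that caps one pool's concurrent tasks.

    Element names (allocations, pool, maxMaps, maxReduces) are assumed to
    follow the Hadoop 1.x FairScheduler allocation file format.
    """
    root = ET.Element("allocations")
    pool = ET.SubElement(root, "pool", name=pool_name)
    ET.SubElement(pool, "maxMaps").text = str(max_maps)
    ET.SubElement(pool, "maxReduces").text = str(max_reduces)
    return ET.tostring(root, encoding="unicode")


def limit_for_hour(hour, day_limit=0, night_limit=40):
    """Pick a task cap: freeze (0) during the day, open up overnight.

    The 08:00-19:59 daytime window and both limits are hypothetical.
    """
    return day_limit if 8 <= hour < 20 else night_limit


if __name__ == "__main__":
    cap = limit_for_hour(hour=14)  # mid-afternoon, so the pool is frozen
    print(build_allocations("longjob", max_maps=cap, max_reduces=cap))
```

A cron job could write this output to the scheduler's allocation file twice a day. As noted above, the scheduler should pick up the change after a short delay without a JT restart, and tasks that are already running are not affected.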
-
Re: freeze a mapreduce job
Michael Segel 2012-05-11, 11:26

Just a quick note...

If your task is currently occupying a slot, the only way to release the slot is to kill the specific task.
If you are using FS, you can move the job to another queue and/or lower the job's priority, which will cause its new tasks to spawn more slowly than those of other jobs, so you will eventually free up the cluster.

There isn't a way to 'freeze' or stop a job mid-state.

Is the issue that the job has a large number of slots, or is it an issue of the individual tasks taking a long time to complete?

If it's the latter, you will probably want to go to the capacity scheduler over the fair scheduler.

HTH

-Mike
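
As a rough sketch of the two manual levers mentioned here (lowering a job's priority and killing a specific task), the Python helpers below only assemble the corresponding `hadoop job` command lines. The job and task-attempt IDs are made-up placeholders. The `-set-priority` and `-kill-task` subcommands are those of the Hadoop 1.x `hadoop job` CLI; other versions may differ.

```python
# Priority names accepted by `hadoop job -set-priority` in Hadoop 1.x.
VALID_PRIORITIES = ("VERY_HIGH", "HIGH", "NORMAL", "LOW", "VERY_LOW")


def set_priority_cmd(job_id, priority):
    """Build the command that changes a running job's priority."""
    if priority not in VALID_PRIORITIES:
        raise ValueError("unknown priority: %s" % priority)
    return ["hadoop", "job", "-set-priority", job_id, priority]


def kill_task_cmd(attempt_id):
    """Build the command that kills one task attempt, freeing its slot."""
    return ["hadoop", "job", "-kill-task", attempt_id]


if __name__ == "__main__":
    # Placeholder IDs; substitute real ones from the JobTracker UI.
    commands = [
        set_priority_cmd("job_201205110000_0001", "VERY_LOW"),
        kill_task_cmd("attempt_201205110000_0001_m_000000_0"),
    ]
    for cmd in commands:
        # Printed rather than executed; pass the list to subprocess.run
        # on a host with the hadoop CLI to actually apply it.
        print(" ".join(cmd))
```

Note that a killed task attempt will normally be rescheduled later, so killing tasks frees slots only temporarily unless the pool or queue limits are lowered as well.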
-
Re: freeze a mapreduce job
Rita 2012-05-11, 11:52

Thanks. I think I will investigate the capacity scheduler.
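
For anyone investigating the CapacityScheduler route suggested earlier in the thread, here is a hedged Python sketch that emits the `<property>` entry for a queue's maximum-capacity. The queue name `longjob` and the value 10 are hypothetical. The property name is assumed to follow the Hadoop 1.x pattern `mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity`. Whether a value of 0 is accepted has not been verified here, and some versions may reject a maximum that is below the queue's configured capacity.

```python
import xml.etree.ElementTree as ET


def max_capacity_property(queue, percent):
    """Return a <property> element capping a queue's share of the cluster.

    The property name is an assumption based on the Hadoop 1.x
    capacity-scheduler.xml naming scheme.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    prop = ET.Element("property")
    name = "mapred.capacity-scheduler.queue.%s.maximum-capacity" % queue
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(percent)
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical queue name and percentage.
    print(max_capacity_property("longjob", 10))
```

After placing the entry in capacity-scheduler.xml, the queues would be refreshed with `hadoop mradmin -refreshQueues`, as mentioned above. Running tasks keep their slots until they finish.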
-
Re: freeze a mapreduce job
Shi Yu 2012-05-11, 14:03

Is there any risk in suppressing a job for too long in FS? I guess there are some parameters that control how long a job may wait (such as a timeout). For example, if a job is kept idle for more than 24 hours, is there a configuration that decides whether to kill or keep that job?

Shi
-
Re: freeze a mapreduce job
Harsh J 2012-05-11, 15:53

I am not aware of a job-level timeout or idle monitor.

--
Harsh J
-
Re: freeze a mapreduce job
Robert Evans 2012-05-11, 15:58

There is an idle timeout for map/reduce tasks. If a task makes no progress for 10 minutes (the default), the AM will kill it on 2.0 and the JT will kill it on 1.0. But I don't know of anything associated with a job, other than that in 0.23, if the AM does not heartbeat back in for too long, I believe the RM may kill it and retry, but I don't know for sure.

--Bobby Evans
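
To make the task-level timeout concrete, the small Python sketch below converts a timeout in minutes into the millisecond value the property expects and formats it as a generic `-D` option. The property names are assumptions based on the usual naming (`mapred.task.timeout` on 1.x, `mapreduce.task.timeout` on 2.x), and the helper itself is hypothetical. This controls only the per-task idle timeout described above, not any job-level timeout.

```python
# Default task idle timeout: 10 minutes, expressed in milliseconds.
DEFAULT_TASK_TIMEOUT_MS = 10 * 60 * 1000


def timeout_option(minutes, hadoop_major=1):
    """Return ['-D', 'property=value'] for the task timeout.

    Property names are assumed: mapred.task.timeout on Hadoop 1.x and
    mapreduce.task.timeout on 2.x.
    """
    if hadoop_major == 1:
        prop = "mapred.task.timeout"
    else:
        prop = "mapreduce.task.timeout"
    millis = int(minutes * 60 * 1000)
    return ["-D", "%s=%d" % (prop, millis)]


if __name__ == "__main__":
    print(DEFAULT_TASK_TIMEOUT_MS)
    print(" ".join(timeout_option(30)))
```

The option would go on the job submission command line for jobs that parse generic options (for example via ToolRunner). A job held back by the scheduler has no running tasks, so this timeout does not apply to it while it waits.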
-
Re: freeze a mapreduce job
Michael Segel 2012-05-11, 14:58

I haven't seen any.
Haven't really had to test that...