Hadoop, mail # user - freeze a mapreduce job


Re: freeze a mapreduce job
Michael Segel 2012-05-11, 14:58
I haven't seen any.

Haven't really had to test that...

On May 11, 2012, at 9:03 AM, Shi Yu wrote:

> Is there any risk in suppressing a job for too long in FS? I guess there are some parameters that control how long a job may wait (a timeout, for example). If a job is kept idle for more than 24 hours, is there a configuration setting that decides whether to kill or keep that job?
>
> Shi
>
> On 5/11/2012 6:52 AM, Rita wrote:
>> Thanks. I think I will investigate the capacity scheduler.
>>
>>
>> On Fri, May 11, 2012 at 7:26 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>
>>> Just a quick note...
>>>
>>> If your task is currently occupying a slot, the only way to release the
>>> slot is to kill the specific task.
>>> If you are using FS, you can move the task to another queue and/or you can
>>> lower the job's priority, which will cause new tasks to spawn more slowly than
>>> those of other jobs, so you will eventually free up the cluster.
>>>
>>> There isn't a way to 'freeze' or stop a job mid state.
>>>
>>> Is the issue that the job has a large number of slots, or is it an issue
>>> of the individual tasks taking a long time to complete?
>>>
>>> If it's the latter, you will probably want to go to a capacity scheduler
>>> over the fair scheduler.
>>>
>>> HTH
>>>
>>> -Mike
>>>
>>> On May 11, 2012, at 6:08 AM, Harsh J wrote:
>>>
>>>> I do not know about the per-host slot control (that is most likely not
>>>> supported, or not yet anyway - and perhaps feels wrong to do), but the
>>>> rest of your needs should be doable if you use schedulers and
>>>> queues/pools.
>>>>
>>>> If you use FairScheduler (FS), ensure that this job always goes to a
>>>> special pool and when you want to freeze the pool simply set the
>>>> pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
>>>> tasks as you wish, to constrict instead of freeze. When you make
>>>> changes to the FairScheduler configs, you do not need to restart the
>>>> JT, and you may simply wait a few seconds for FairScheduler to refresh
>>>> its own configs.
>>>>
>>>> More on FS at
>>>> http://hadoop.apache.org/common/docs/current/fair_scheduler.html
>>>>
>>>> If you use CapacityScheduler (CS), then I believe you can do this by
>>>> again making sure the job goes to a specific queue, and when needed to
>>>> freeze it, simply set the queue's maximum-capacity to 0 (percentage)
>>>> or to constrict it, choose a lower, positive percentage value as you
>>>> need. You can also refresh CS to pick up config changes by refreshing
>>>> queues via mradmin.
>>>>
>>>> More on CS at
>>>> http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
>>>>
>>>> Neither approach will freeze/constrict the job immediately, but both
>>>> should certainly prevent it from progressing. That is, tasks already
>>>> running when the scheduler config is changed will continue to run to
>>>> completion, but any further task scheduling for those jobs will be
>>>> subject to the changes made.
>>>>
>>>> P.s. A better solution would be to make your job not take as many
>>>> days, somehow? :-)
>>>>
>>>> On Fri, May 11, 2012 at 4:13 PM, Rita <[EMAIL PROTECTED]> wrote:
>>>>> I have a rather large MapReduce job which takes a few days. I was
>>>>> wondering if it's possible for me to freeze the job or make the job
>>>>> less intensive. Is it possible to reduce the number of slots per host
>>>>> and then increase them overnight?
>>>>>
>>>>>
>>>>> tia
>>>>>
>>>>> --
>>>>> --- Get your facts first, then you can distort them as you please.--
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>
>
>
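
The two scheduler approaches described in the thread can be sketched as the config fragments below. This is a minimal sketch, not a tested recipe: the pool/queue name "longjob" is hypothetical, the FairScheduler element names (maxMaps, maxReduces) are the ones named in the thread, and the CapacityScheduler property name follows the Hadoop 1.x naming pattern as an assumption that should be checked against the scheduler documentation for your version. Whether a maximum-capacity of 0 is accepted is also unverified here, as in the thread.

```python
import xml.etree.ElementTree as ET


def fair_scheduler_allocations(pool, max_maps, max_reduces):
    """Render a FairScheduler allocation file that caps a single pool.

    Setting both limits to 0 is the "freeze" case from the thread;
    small positive values constrict the pool instead.
    """
    root = ET.Element("allocations")
    pool_el = ET.SubElement(root, "pool", name=pool)
    ET.SubElement(pool_el, "maxMaps").text = str(max_maps)
    ET.SubElement(pool_el, "maxReduces").text = str(max_reduces)
    return ET.tostring(root, encoding="unicode")


def capacity_scheduler_property(queue, max_capacity_percent):
    """Render one <property> entry capping a CapacityScheduler queue.

    ASSUMPTION: the property name below follows the Hadoop 1.x pattern
    mapred.capacity-scheduler.queue.<queue>.maximum-capacity.
    """
    prop = ET.Element("property")
    name = "mapred.capacity-scheduler.queue.%s.maximum-capacity" % queue
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(max_capacity_percent)
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # "longjob" is a hypothetical pool/queue name used for illustration.
    print(fair_scheduler_allocations("longjob", 0, 0))
    print(capacity_scheduler_property("longjob", 0))
```

The first fragment would go in the FairScheduler allocation file, which the scheduler re-reads on its own after a short delay. The second would go in the CapacityScheduler config and, per the thread, takes effect only after refreshing queues via mradmin. In both cases tasks that are already running keep their slots until they finish.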