Is it possible to set how many map slots to use on each job submission?
I often run into situations like this:
I am running a very heavy job (let's say job 1) on a Hadoop cluster, which
takes many hours. Then something comes up that needs to be done very
quickly (let's say job 2).
Job 2 only takes a couple of hours when executed on Hadoop, but it would
take tens of hours if run on a single machine.
So I'd definitely want to use Hadoop for job 2. But since job 1 is already
running on Hadoop and hogging all the map slots, I can't run job 2 on
Hadoop (it will only be queued).
So I was wondering:
Is there a way to set a specific number of map slots (or the number of slave
nodes) to use when submitting each job?
I have read that setNumMapTasks() is only a hint to the framework, not an enforced limit.
I would like to leave a couple of map slots free for occasions like the one above.
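For context, this is the kind of cap I was hoping for. If I understand the docs correctly, the Fair Scheduler's allocation file can limit how many map/reduce slots a pool may use; the pool names and slot counts below are just placeholders for illustration, not a tested setup:

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: allocation file for the Hadoop Fair Scheduler.
     Pool names and slot counts are illustrative placeholders. -->
<allocations>
  <!-- Long-running heavy jobs (like job 1): cap them below cluster capacity
       so some map slots always remain free. -->
  <pool name="batch">
    <maxMaps>20</maxMaps>
    <maxReduces>10</maxReduces>
  </pool>
  <!-- Urgent jobs (like job 2): guarantee them a minimum share. -->
  <pool name="urgent">
    <minMaps>4</minMaps>
  </pool>
</allocations>
```

If that's right, I could then route each job to a pool at submission time (I believe via the mapred.fairscheduler.pool job property, though I'm not sure of the exact property name across versions). Is this the intended way to do it, or is there a simpler per-job setting?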