nice(1) only changes CPU scheduling priority. It doesn't really help
if your tasks (and their child processes) use too much memory and push
the machine into swap, which is probably the real reason the servers
freeze. Lowering the kernel's swappiness (vm.swappiness) will probably
help.
Another thing to try is ionice(1), if the freeze is caused by I/O
contention (assuming no swapping). It needs Linux with a reasonably
recent kernel and CFQ as the I/O scheduler, which is the default on
RHEL 5.
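
Before tuning anything it may be worth checking what the box is
currently set to. Below is a rough sketch in Python that reads the
swappiness value and the active I/O scheduler from the standard Linux
/proc and /sys files. The device name "sda" is only an assumption,
substitute whatever disk your data directories live on.

```python
import re


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return vm.swappiness as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def parse_active_scheduler(text):
    """The kernel brackets the active scheduler, e.g. 'noop deadline [cfq]'."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_scheduler(device, sys_root="/sys/block"):
    """Return the active I/O scheduler for a block device, or None."""
    try:
        with open("%s/%s/queue/scheduler" % (sys_root, device)) as f:
            return parse_active_scheduler(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    device = "sda"  # assumption: replace with your actual data disk
    print("vm.swappiness =", read_swappiness())
    print("I/O scheduler on %s =" % device, read_scheduler(device))
```

If the scheduler reported is not cfq, ionice will most likely have no
visible effect.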
You can write a simple script that periodically runs renice(1) and
ionice(1) on these processes to see whether that actually works for
you.
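
A minimal sketch of such a script, in Python. It is only an
illustration, not something I have run on a production cluster. The
pattern used to find the task processes, the nice value (19), the
ionice class/level (best-effort, level 7) and the 30 second interval
are all assumptions to adjust. It needs to run as root or as the user
that owns the task processes.

```python
import os
import subprocess
import sys
import time

NICE_VALUE = 19   # assumption: lowest CPU priority
IO_CLASS = 2      # assumption: ionice best-effort class
IO_LEVEL = 7      # assumption: lowest priority within that class
INTERVAL = 30     # assumption: seconds between passes


def find_pids(pattern, proc_root="/proc"):
    """Return PIDs whose command line contains the given substring."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited while we were scanning
        cmdline = raw.replace(b"\0", b" ").decode("utf-8", "replace")
        # Skip ourselves: our own command line contains the pattern too.
        if pattern in cmdline and int(entry) != os.getpid():
            pids.append(int(entry))
    return sorted(pids)


def build_commands(pid):
    """Build the renice(1) and ionice(1) command lines for one PID."""
    return [
        ["renice", str(NICE_VALUE), "-p", str(pid)],
        ["ionice", "-c", str(IO_CLASS), "-n", str(IO_LEVEL), "-p", str(pid)],
    ]


def throttle(pids):
    """Apply both commands to every PID, ignoring processes that vanished."""
    for pid in pids:
        for cmd in build_commands(pid):
            try:
                subprocess.call(cmd)
            except OSError:
                pass  # renice or ionice not installed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python throttle.py <substring of the task command line>
    while True:
        throttle(find_pids(sys.argv[1]))
        time.sleep(INTERVAL)
```

The loop is there because new task JVMs keep getting started, so a
one-time renice would miss them. The file name throttle.py is made up;
pass whatever substring identifies your task processes in ps output.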
On Tue, Nov 2, 2010 at 4:51 PM, Jinsong Hu <[EMAIL PROTECTED]> wrote:
> Hi there,
> I have a cluster that is used for both Hadoop MapReduce and HBase. What I
> found is that when I am running map/reduce jobs, the jobs can be very
> memory/CPU intensive and cause HBase or the data nodes to freeze. In HBase's
> case, the region server may shut itself down.
> In order to avoid this, I made the configuration of the maximum number of
> mappers and reducers very conservative. However, I am wondering if Hadoop
> allows me to start map/reduce with the command "nice" so that
> those jobs get lower priority than datanode/tasktracker/hbase regionserver.
> That way, if there are enough resources, the jobs can fully utilize them, but
> if not, those jobs will yield to other processes.