Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!dali.cs.montana.edu!ogicse!intelhf!ichips!inews!hopi!bhoughto
From: bhoughto@hopi.intel.com (Blair P. Houghton)
Newsgroups: comp.unix.internals
Subject: Re: "Nice" or not to "nice" large jobs
Keywords: scheduling priority nice
Message-ID: <4281@inews.intel.com>
Date: 18 May 91 00:02:29 GMT
References: <3197@sparko.gwu.edu> <1991May16.140622.29266@alchemy.chem.utoronto.ca>
Sender: news@inews.intel.com
Organization: Intel Corp, Chandler, AZ
Lines: 65

In article <1991May16.140622.29266@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>Our policy for nice/renice is:
>
>0  - "short" processes/jobs (compilations, or up to 5 minutes cpu time).
>1  - "medium" processes/jobs (up to 1 hour cpu time).
>2  - "long" processes/jobs (all other jobs).

That will have almost no effect.

(The numbered equations below are from Chapter 5 of _The
Design and Implementation of the 4.3BSD UNIX(R) Operating
System_, by S. J. Leffler, et al., Addison-Wesley, 1989;
other systems use different schemes, but the principle is
usually similar.)

The p_nice (set by nice(1) or renice(8)) of a process
affects the priority (in 4.3BSD) as follows:

	p_usrpri = PUSER + (p_cpu/4) + 2*p_nice               (Eq. 5.1)

Since p_usrpri is usually on the order of 50-80 for
non-sleeping processes, nice values of 0-2 directly adjust
it by only 0-4 units, or 0-8% of the priority.  The effects
of I/O, paging and swapping will easily swamp that.
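
To put numbers on Eq. 5.1, here is a small Python sketch (not
kernel code).  PUSER = 50 is the typical value mentioned in
this post; the helper names usrpri and nice_share are made up
for illustration, and the integer division is an assumption
about how p_cpu/4 gets rounded.

```python
PUSER = 50  # assumed base user priority (the "often 50" case)


def usrpri(p_cpu, p_nice, puser=PUSER):
    """Eq. 5.1: p_usrpri = PUSER + p_cpu/4 + 2*p_nice."""
    return puser + p_cpu // 4 + 2 * p_nice


def nice_share(p_cpu, p_nice, puser=PUSER):
    """Fraction of p_usrpri that comes directly from the 2*p_nice term."""
    return 2 * p_nice / usrpri(p_cpu, p_nice, puser)


# Columns: p_nice, priority when idle (p_cpu = 0), priority when
# busy (p_cpu = 120), and the direct share of the nice term when idle.
for nice in (0, 1, 2):
    print(nice, usrpri(0, nice), usrpri(120, nice),
          round(nice_share(0, nice), 3))
```

The share shrinks further as p_cpu grows, since the p_cpu/4
term gets larger while 2*p_nice stays fixed.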

p_nice has the secondary effect, however, of retarding the
decay of p_cpu over time:  p_cpu is incremented at each
clock-tick (1/60th or 1/100th of a second, depending on
the implementation) and decremented once each second
according to

	p_cpu = p_cpu * ( (2*load) / (2*load + 1) ) + p_nice  (Eq. 5.2)

where load is the average length of the run queue over
the past minute.
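
A rough Python sketch of the once-per-second filter in
Eq. 5.2, for a process that has stopped accumulating ticks.
The function names are hypothetical, load is held constant,
and floating point stands in for whatever integer arithmetic
a real kernel uses.

```python
def decay(p_cpu, load, p_nice):
    """Apply Eq. 5.2 once (one second's worth of decay)."""
    return p_cpu * (2 * load) / (2 * load + 1) + p_nice


def decay_series(p_cpu, load, p_nice, seconds):
    """Return p_cpu after each of `seconds` applications of Eq. 5.2,
    assuming the process accrues no new clock ticks in between."""
    series = []
    for _ in range(seconds):
        p_cpu = decay(p_cpu, load, p_nice)
        series.append(p_cpu)
    return series


# Start from p_cpu = 120 at load 2.0 and compare a few nice values.
for nice in (0, 2, 12):
    print(nice, [round(x, 1) for x in decay_series(120, 2.0, nice, 5)])
```

Under these assumptions p_cpu settles toward
p_nice * (2*load + 1) rather than toward zero, which is the
sense in which p_nice retards the decay.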

In the case of two competing, cpu-hogging processes, load
remains at approximately 2.0, meaning that (5.2)
becomes approximately

	p_cpu = 0.8 * p_cpu + p_nice

At this load the decay removes 0.2 * p_cpu each second, so
for p_nice to cancel just half of the decay, it should be
larger than 0.1 * p_cpu.  Since PUSER is often 50 and p_nice
is often 0, a p_usrpri of 50-80 corresponds (by Eq. 5.1) to
a p_cpu of 0-120, which for maximum coverage means you want
a p_nice of 12 or more; anything lower is not going to
affect scheduling more than half the time.  Most importantly,
since 0-120 is practically the entire range of 0-127 of
which p_cpu is capable, you're going to see the effects of
differences in p_nice of +-1 only when running jobs for a
very, very long time.  Certainly a process that does _any_
iterative I/O is going to negate the nice value of a
competing process that does no I/O until the end.
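
The arithmetic behind the figure of 12 can be checked with a
few lines of Python.  This only inverts Eq. 5.1 and reads the
threshold off Eq. 5.2; the function names and the PUSER = 50
default are assumptions for illustration.

```python
def p_cpu_from_usrpri(p_usrpri, p_nice=0, puser=50):
    """Invert Eq. 5.1: p_cpu = 4 * (p_usrpri - PUSER - 2*p_nice)."""
    return 4 * (p_usrpri - puser - 2 * p_nice)


def nice_to_cancel(fraction, p_cpu, load):
    """p_nice needed to offset `fraction` of one second's decay.

    Eq. 5.2 removes p_cpu / (2*load + 1) per second and adds p_nice back.
    """
    return fraction * p_cpu / (2 * load + 1)


top = p_cpu_from_usrpri(80)  # p_cpu at the top of the 50-80 range
print(top, nice_to_cancel(0.5, top, 2.0))
```

The same helper shows how the threshold moves with load: a
longer run queue slows the decay, so a smaller p_nice is
enough to cancel half of it.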

You've got a good scheme, but instead of { 0, 1, 2 } you
should probably use { 4, 8, 12 }, and leave 0 for processes
with insignificant run-times (e.g., shell input).
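
For a feel of how much the spacing of nice values matters,
here is a toy Python model of two CPU-bound processes.  It is
only a sketch under loud assumptions: load is pinned at 2.0,
every clock tick goes to whichever process has the lowest
p_usrpri from Eq. 5.1, Eq. 5.2 runs once per second, and
I/O, sleeping, time quanta and any cap on p_cpu are ignored.
The names simulate and usrpri are invented for the example.

```python
PUSER = 50  # assumed base user priority


def usrpri(p_cpu, p_nice):
    """Eq. 5.1, without rounding or clamping."""
    return PUSER + p_cpu / 4 + 2 * p_nice


def simulate(nices, seconds=60, hz=100, load=2.0):
    """Toy scheduler: each tick goes to the process with the lowest
    p_usrpri (ties go to the earlier entry); Eq. 5.2 runs once a second.
    Returns the number of ticks each process received."""
    p_cpu = [0.0] * len(nices)
    ticks = [0] * len(nices)
    keep = (2 * load) / (2 * load + 1)
    for _ in range(seconds):
        for _ in range(hz):
            i = min(range(len(nices)),
                    key=lambda k: usrpri(p_cpu[k], nices[k]))
            p_cpu[i] += 1   # the running process is charged one tick
            ticks[i] += 1
        # Once-per-second decay, Eq. 5.2, applied to every process.
        p_cpu = [c * keep + n for c, n in zip(p_cpu, nices)]
    return ticks


for pair in ((0, 2), (0, 12)):
    print(pair, simulate(pair))
```

In this model the job at nice 12 should end up with a
visibly smaller share of the ticks than the job at nice 2,
though a real system with I/O and paging will blur the
difference.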

Much more effective, however, is to have all long jobs run
at nice value 0 when there are no users on the system and
let them thrash it out in the middle of the night.

				--Blair
				  "Your mileage may vary."