Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!dali.cs.montana.edu!ogicse!intelhf!ichips!inews!hopi!bhoughto
From: bhoughto@hopi.intel.com (Blair P. Houghton)
Newsgroups: comp.unix.internals
Subject: Re: "Nice" or not to "nice" large jobs
Keywords: scheduling priority nice
Message-ID: <4281@inews.intel.com>
Date: 18 May 91 00:02:29 GMT
References: <3197@sparko.gwu.edu> <1991May16.140622.29266@alchemy.chem.utoronto.ca>
Sender: news@inews.intel.com
Organization: Intel Corp, Chandler, AZ
Lines: 65

In article <1991May16.140622.29266@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>Our policy for nice/renice is:
>
>0 - "short" processes/jobs (compilations, or up to 5 minutes cpu time).
>1 - "medium" processes/jobs (up to 1 hour cpu time).
>2 - "long" processes/jobs (all other jobs).

That has almost no effect.

(The numbered equations below are from Chapter 5 of _The Design
and Implementation of the 4.3BSD UNIX(R) Operating System_, by
S. J. Leffler, et al., Addison-Wesley, 1989; other systems use
different schemes, but the principle is usually similar.)

The p_nice (set by nice(1) or renice(8)) of a process affects
the priority (in 4.3BSD) as follows:

	p_usrpri = PUSER + (p_cpu/4) + 2*p_nice		(Eq. 5.1)

Since p_usrpri is usually on the order of 50-80 for non-sleeping
processes, nice values of 0-2 directly adjust it by only 0-4 units,
or 0-8% of the priority. The effects of I/O, paging and swapping
will easily swamp that.

p_nice has the secondary effect, however, of retarding the decay
of p_cpu over time: p_cpu is incremented at each clock-tick (1/60th
or 1/100th of a second, depending on the implementation) and
decremented once each second according to

	p_cpu = p_cpu * ( (2*load) / (2*load + 1) ) + p_nice	(Eq. 5.2)

where load is the average length of the run queue over the past
minute.

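For anyone who wants to plug in numbers, here is a rough Python sketch of
Eq. 5.1 and Eq. 5.2. It is not the 4.3BSD kernel code: the function names,
the default PUSER of 50, the integer division used for p_cpu/4, and the
floating-point arithmetic in the decay step are all assumptions made for
illustration.

```python
PUSER = 50  # assumed base user priority; real systems may use another value


def usrpri(p_cpu, p_nice, puser=PUSER):
    """Eq. 5.1: user-mode priority (a larger number means a lower priority).

    p_cpu/4 is taken as integer division, as it would be in C; that is an
    assumption of this sketch.
    """
    return puser + p_cpu // 4 + 2 * p_nice


def decay_cpu(p_cpu, p_nice, load):
    """Eq. 5.2: the once-per-second adjustment of p_cpu.

    Uses floating point for readability; a kernel would use integer or
    fixed-point arithmetic.
    """
    return p_cpu * (2.0 * load) / (2.0 * load + 1.0) + p_nice


def nice_shift(p_nice):
    """Direct contribution of p_nice to p_usrpri in Eq. 5.1."""
    return usrpri(0, p_nice) - usrpri(0, 0)


if __name__ == "__main__":
    # Compare the three nice values from the quoted policy.
    for nice in (0, 1, 2):
        print(nice, nice_shift(nice), usrpri(60, nice), decay_cpu(60, nice, 1.0))
```

Calling nice_shift() with the policy's values of 0, 1 and 2 shows how few
units of p_usrpri they move directly.
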
In the case of two competing, cpu-hogging processes, load remains
at approximately 2.0, meaning that (5.2) becomes approximately

	p_cpu = 0.8 * p_cpu + p_nice

The decay removes 0.2 * p_cpu each second, so in order to have p_nice
cancel just half of the decay, it should be larger than 0.1 * p_cpu.
Since PUSER is often 50 and p_nice is often 0, a p_usrpri of 50-80
corresponds to a p_cpu of 0-120, which for maximum coverage means you
want a p_nice of 12 or more; anything lower is not going to affect
scheduling more than half the time.

Most importantly, since 0-120 is practically the entire range
of 0-127 of which p_cpu is capable, you're going to see the effects
of differences in p_nice of +/-1 only when running jobs for a very,
very long time. Certainly a process that does _any_ iterative I/O
is going to obviate the nice-value of a competing process that does
no I/O until the end.

You've got a good scheme, but instead of { 0, 1, 2 } you
should probably use { 4, 8, 12 }, and leave 0 for processes
with insignificant run-times (shell input, e.g.).

Much more effective, however, is to have all long jobs run
at nice value 0 when there are no users on the system and let
them thrash it out in the middle of the night.

				--Blair
				  "Your mileage may vary."