Newsgroups: comp.unix.wizards
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!caen!hellgate.utah.edu!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Subject: Re: Load Avarage graph pattern
Organization: Lawrence Berkeley Laboratory, Berkeley
References: <2155@ccsg.tau.ac.il> <MYCROFT.91May31031208@goldman.gnu.ai.mit.edu> <MEISSNER.91May31111801@curley.osf.org>
Message-ID: <14081@dog.ee.lbl.gov>
X-Local-Date: Sat, 8 Jun 91 13:49:34 PDT
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Date: Sat, 8 Jun 91 20:49:34 GMT

In article <2155@ccsg.tau.ac.il> shani@GENIUS.TAU.AC.IL (Oren Shani) asks:
>Can anyone tell me why the load avarge graph shows definite patterns
>of exponential decay?  It seems that most (by far most) of the points
>of the LA graph are on lines of the form c*exp(-a*(t-t0))+b, in which
>'a' is some cosmic constant ...

Surprisingly, I have seen no answers to this at all, when the reason is
trivial.  This exponential decay is there because it was designed to be
there.  The load average is computed by iterations of the formula:

	average[t]  =  average[t-1] * exp(-1/k)  +  n * (1 - exp(-1/k))

where `t' is time (counted in samples), `n' is the instantaneous `number
of runnable jobs', and `k' is the number of samples per `load average
time' (that is, per averaging period).
Since the load average sample interval is 5 seconds, the one-minute
average has k=12 (12*5 = 60 seconds), the 5-minute average has k=60
(60*5 = 300 s = 5 min), and the 15-minute average has k=180 (180*5 =
900 s = 15 min).  When n is zero, as it typically is on workstations,
this reduces to

	average[t]  =  average[0] * exp(-t/k)

i.e., exponential decay.
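
For anyone who wants to see the curve without waiting on xload, here is
a minimal sketch of the iteration in Python.  It is not the kernel code
(which I would expect to use fixed-point arithmetic rather than floats);
the function names and the SAMPLE_INTERVAL constant are mine, made up
for illustration.

```python
import math

SAMPLE_INTERVAL = 5  # seconds between load average samples, as stated above


def decay_factor(period_seconds):
    """Return exp(-1/k), where k is the number of samples per period."""
    k = period_seconds / SAMPLE_INTERVAL
    return math.exp(-1.0 / k)


def update(average, n, period_seconds):
    """One iteration: the new average from the old one and the runnable count n."""
    f = decay_factor(period_seconds)
    return average * f + n * (1.0 - f)


def simulate(start, runnable_counts, period_seconds=60):
    """Apply update() once per sample; the returned list begins with `start`."""
    averages = [start]
    for n in runnable_counts:
        averages.append(update(averages[-1], n, period_seconds))
    return averages


if __name__ == "__main__":
    # An idle machine (n = 0 at every sample) starting from a load of 1.0.
    for t, avg in enumerate(simulate(1.0, [0] * 12)):
        print(f"t={t * SAMPLE_INTERVAL:3d}s  load={avg:.3f}")
```

With k=12 and n held at zero, twelve samples (one minute) take the
one-minute average from 1.0 down to exp(-1), about 0.368, which is the
`cosmic constant' in the original question.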

The reason this is consistent across many systems is that it was done
at Berkeley for 4BSD and then copied into those systems.

In article <MYCROFT.91May31031208@goldman.gnu.ai.mit.edu>
mycroft@goldman.gnu.ai.mit.edu (Charles Hannum) writes:
>I noticed this a long time ago, while running xload.  For some reason,
>every 30 or 60 seconds, the load will suddenly jump and slowly decay
>on an otherwise idle machine. ... I've always attributed [these spikes]
>to 'update' ('syncer' on some systems), and ignored it.

This is almost certainly the correct explanation (/etc/update is counted
as runnable while waiting for the sync() system call, which it typically
issues once every 30 seconds).
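
The jump-and-decay shape is easy to reproduce with the same iteration.
The sketch below assumes (my assumption, for illustration only) that
update is caught runnable on exactly one 5-second sample out of every
six and that nothing else ever runs; spike_trace and its parameters are
made-up names.

```python
import math


def spike_trace(samples=36, k=12, spike_every=6):
    """One-minute load average with one runnable job every `spike_every` samples."""
    f = math.exp(-1.0 / k)
    avg = 0.0
    trace = []
    for t in range(samples):
        # Assumption: the job is seen as runnable on one sample per 30 s.
        n = 1 if t % spike_every == 0 else 0
        avg = avg * f + n * (1.0 - f)
        trace.append(avg)
    return trace


if __name__ == "__main__":
    # Crude text plot: each row is one 5-second sample.
    for t, avg in enumerate(spike_trace()):
        print(f"{t * 5:4d}s {avg:.3f} " + "#" * int(avg * 200))
```

Each row where n=1 shows a jump, and the rows between jumps fall off by
the same factor exp(-1/12) per sample.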

In article <MEISSNER.91May31111801@curley.osf.org> meissner@osf.org
(Michael Meissner) writes:
>Another thing could be the activity to run the various xclock
>programs, and such.  I would imagine that on timesharing systems with
>lots of xterms, this could be significant.

xclock is particularly unlikely to add to the load average (although it
does add to the machine load!) because of a design misfeature in most
Unix systems.  The problem is that the system metering---the code that
computes the load average, cpu utilization for each process, and so
on---is run off the same clock as the scheduler.  Thus, at every clock
tick (or every n'th tick), we first see what is going on---nothing---
then we schedule the clock program, which runs for a short while and
goes back to sleep.

In particular, given the usage-sensitive CPU scheduling found in most
BSD-derived schedulers (which is to say every SunOS system through at
least SunOS 3.5, and probably 4.x as well), it is possible for a
program to use the clock to drive itself just after it is sampled as
sleeping, work until just before the next sample, and then go to sleep
waiting for the next clock tick.  By doing this it appears to use no
CPU time, hence gets fairly high priority (the kernel believes that it
has not got its fair share of CPU yet) and runs immediately on the next
clock tick, and thus is asleep again by the time the clock ticks
again.  This perpetuates the cycle.  Such a process can starve out
other processes.
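
A toy model of the accounting half of this problem, in Python.  It does
not model the scheduler's priority feedback, only the fact that usage
is charged to whoever is on the CPU at the tick.  Everything here (the
TICK granularity, the two process names, the fixed schedule) is
hypothetical and chosen purely for illustration.

```python
TICK = 10  # fine-grained time units per clock tick (hypothetical value)


def running_at(time_unit):
    """Return which process holds the CPU at this instant in the toy schedule."""
    phase = time_unit % TICK
    # "sneaky" wakes just after the tick and sleeps just before the next one.
    return "sneaky" if 1 <= phase <= TICK - 2 else "honest"


def account(total_ticks):
    """Compare CPU time actually used with CPU time charged at tick samples."""
    actual = {"sneaky": 0, "honest": 0}
    charged = {"sneaky": 0, "honest": 0}
    for u in range(total_ticks * TICK):
        who = running_at(u)
        actual[who] += 1
        if u % TICK == 0:
            # The tick handler looks at the CPU only at this instant.
            charged[who] += 1
    return actual, charged


if __name__ == "__main__":
    actual, charged = account(100)
    print("actual time units:", actual)
    print("ticks charged:    ", charged)
```

In this schedule the `sneaky' process uses 8 of every 10 time units and
is charged for none of them, while the other process is charged on
every tick.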

The solution is simple but requires relatively precise clocks.
Fortunately such clocks exist on Sun SparcStations (unlike Sun-3s).
The 4BSD Sparc kernel will use them, once I get around to fixing that
part of the system.  (First I have to get running multi-user, now that
single user boots work, and write such minor [ahem] things as a frame
buffer driver and get enough going to make X run....  Sorry, Masataka,
but I intend to run X windows on *my* workstation, at least until
something better comes along. :-) )
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov