Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!unmvax!polyslo!steve
From: steve@polyslo.CalPoly.EDU (Steve DeJarnett)
Newsgroups: comp.sys.sequent
Subject: Re: load average
Message-ID: <10847@polyslo.CalPoly.EDU>
Date: 28 Apr 89 08:29:51 GMT
References: <2470@helios.ee.lbl.gov> <67727@pyramid.pyramid.com> <15248@sequent.UUCP>
Reply-To: steve@polyslo.CalPoly.EDU (Steve DeJarnett)
Organization: Lab Rat Rumpus Room -- Cal Poly SLO
Lines: 85

In article <15248@sequent.UUCP> jjb@sequent.UUCP (Jeff Berkowitz) writes:
>In article <67727@pyramid.pyramid.com>, csg@pyramid.pyramid.com (Carl S. Gutekunst) writes:
>>There's a reason for that. Dynix divides the load average by the number of
>>CPUs you have. If uptime(1) displays 1.6, and you have four CPUs, then the
>>load average is really 6.4.
>"really"? :-)
>
>Very early in the history of DYNIX, Sequent experimented with both
>alternative implementations of load average.  The existing one was
>selected because it more accurately described the behavior that
>users perceived.

	Load average was (long ago) defined to be "the average number of jobs
in the run queue over the last 1, 5, and 15 minutes".  To quote directly from the 
Dynix Version 3.0.4 man page for 'w':

	The load average numbers give the number of jobs
     in the run queue averaged over 1, 5 and 15 minutes.

Are there multiple run queues on a Balance 8000??  I've never studied the
implementation of Dynix (lack of source makes it more difficult also :-), but
I'd suspect there's one run queue, and processors grab the next job eligible
when they're free.  Am I correct here??
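
	For reference, here is a rough sketch of how "averaged over 1, 5 and 15
minutes" is conventionally computed on BSD-derived systems: the run queue
length is sampled periodically and folded into three exponentially damped
averages.  This is not taken from Dynix source; the 5-second sampling interval
and the names SAMPLE_SECONDS, update_load_averages and simulate are
assumptions made for illustration.

```python
import math

SAMPLE_SECONDS = 5                 # assumed sampling interval
WINDOWS_SECONDS = (60, 300, 900)   # the 1, 5 and 15 minute windows


def update_load_averages(averages, runnable_jobs):
    """Fold one run-queue sample into the three damped averages."""
    updated = []
    for avg, window in zip(averages, WINDOWS_SECONDS):
        # Older samples decay by this factor at every sampling step.
        decay = math.exp(-SAMPLE_SECONDS / window)
        updated.append(avg * decay + runnable_jobs * (1.0 - decay))
    return updated


def simulate(samples):
    """Start from an idle system and apply a sequence of run-queue samples."""
    averages = [0.0, 0.0, 0.0]
    for runnable_jobs in samples:
        averages = update_load_averages(averages, runnable_jobs)
    return averages
```

With a constant number of runnable jobs, all three figures drift toward that
number, and the 1-minute figure gets there first.  Note that nothing in this
scheme knows how many processors there are.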

	If so, then the notion of dividing the # of jobs in the run queue
by the number of processors to obtain load average is in conflict with what
the manuals say you're doing.  Of course, as has been pointed out, load averages
are merely subjective measurements of your system "response".  As we all know,
system "response" depends on a great number of things.  

	So, the question boils down to this:  Do you want to generate load
averages "like the rest of the world" that reflect how many jobs are in your
run queue, with the added caveat of "but we have N processors to run these
M jobs on, so the effective load (or some such term as that) is really M/N"?
Or do you generate load averages the way you currently do, where "load
average is dependent on the number of processors AND the number of processes
(and, oh, therefore our load averages MAY or MAY NOT compare directly with
those of machine X)"?
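
	The two conventions differ only by a factor of the processor count, so
converting between them is trivial if you know N.  A small sketch (the function
names are made up for illustration), using Carl's figures of 1.6 on four CPUs:

```python
def per_processor_load(raw_load, processors):
    """Run-queue load divided by processor count (the M/N figure)."""
    return raw_load / processors


def raw_from_per_processor(per_cpu_load, processors):
    """Recover the run-queue load from a per-processor figure."""
    return per_cpu_load * processors


# Carl's example: uptime(1) shows 1.6 on a four-CPU machine.
raw = raw_from_per_processor(1.6, 4)

# The comparison from this post: a raw load of 7.5 seen per processor
# on a hypothetical four-CPU machine.
scaled = per_processor_load(7.5, 4)
```

The catch is that a reader of the displayed number has to know which convention
(and which N) produced it before comparing machines.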

	Personally, I prefer the former.  This gives you a way of comparing
Apples to Apples (figuratively, not literally).  If there's a load average of
7.5 on my Pyramid, and a load average of 7.5 on my Sequent, I would know that 
they are measuring the same thing.  Then if I log into my Sequent and find that
response time is faster (or slower), I would have a means of direct comparison
that is quantifiable (sp??).

	I realize that in the end, this whole thing boils down to a religious
issue over what you believe is "right".  I personally (if it wasn't already
apparent) believe that "number of jobs in the run queue" is the appropriate
measure.  That's just me, though.

	One last question.  When Sequent computes their load average, do they
take into account the possibility that some of the processors might not have
been available during the last 1, 5, and 15 minutes??  If I have 2 processors running
user processes, but the Sequent is basing its calculation of load average on
10 processors (or however many there actually are in my system), then a load 
average based on that premise is not a truly representative number.
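
	To make the concern concrete, here is a sketch with made-up numbers (6
runnable jobs, 10 processors configured, 2 actually available).  Whether Dynix
divides by the configured or the available count is exactly what I don't know;
normalized_load and the constants are hypothetical.

```python
def normalized_load(runnable_jobs, processors):
    """Per-processor load for a given divisor."""
    if processors <= 0:
        raise ValueError("need at least one processor")
    return runnable_jobs / processors


RUNNABLE_JOBS = 6          # hypothetical run-queue length
CONFIGURED_PROCESSORS = 10 # processors physically in the system
ONLINE_PROCESSORS = 2      # processors actually running user processes

by_configured = normalized_load(RUNNABLE_JOBS, CONFIGURED_PROCESSORS)
by_online = normalized_load(RUNNABLE_JOBS, ONLINE_PROCESSORS)

print("divided by configured processors:", by_configured)
print("divided by online processors:    ", by_online)
```

The same run queue looks nearly idle under one divisor and overloaded under the
other.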

>How does a four processor 9845 handle load average?  I presume
>from your comment that Pyramid does not divide by the number of
>processors?  Does this mean performance does not scale linearly?

	I don't think they divide on a 2-processor 98x, so I doubt things are that
different on a 9845 (our machine's kernel actually believes that it's a 9810,
but that's a totally different story).  Load average on a Pyramid (correct me
if I'm wrong, Carl) is "Average # of jobs in the run queue over the last 1, 5,
and 15 minutes".  The fact that you have 4 processors there to keep things 
going makes it all the better.

	One other question springs to mind here (sorry this is getting very 
long):  Given more processors to run jobs, won't the jobs that are there finish
(hopefully) sooner than they would on a system with fewer of the same 
processors, and therefore result in there being fewer jobs in the run queue at
any given moment in time overall??  This would seem to be another argument (if
it is indeed true) against Sequent's method of load average computation.
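
	A toy simulation of that intuition, assuming a single shared run queue,
fixed-size jobs arriving at a fixed rate, and no preemption.  None of this
models a real Dynix or Pyramid scheduler; average_runnable and its parameters
are invented for the illustration.

```python
def average_runnable(ncpus, ticks=1000, arrival_every=2, work_per_job=3):
    """Average number of runnable jobs (running or waiting) per tick."""
    jobs = []   # remaining work for each runnable job, oldest first
    total = 0
    for tick in range(ticks):
        if tick % arrival_every == 0:
            jobs.append(work_per_job)       # a new job becomes runnable
        total += len(jobs)                  # sample the run queue length
        for i in range(min(ncpus, len(jobs))):
            jobs[i] -= 1                    # each processor does one unit of work
        jobs = [remaining for remaining in jobs if remaining > 0]
    return total / ticks


for ncpus in (1, 2, 4):
    print(ncpus, "processor(s):", average_runnable(ncpus))
```

Under these assumptions the raw run-queue count already drops as processors are
added, until there are more processors than runnable jobs.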

>Jeff Berkowitz N6QOM			uunet!sequent!jjb
>Sequent Computer Systems		Custom Systems Group

-------------------------------------------------------------------------------
| Steve DeJarnett            | Smart Mailers -> steve@polyslo.CalPoly.EDU     |
| Computer Systems Lab       | Dumb Mailers  -> ..!ucbvax!voder!polyslo!steve |
| Cal Poly State Univ.       |------------------------------------------------|
| San Luis Obispo, CA  93407 | BITNET = Because Idiots Type NETwork           |
-------------------------------------------------------------------------------