Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!ames!uhccux!munnari.oz.au!yoyo.aarnet.edu.au!sirius.ucs.adelaide.edu.au!hydra!francis
From: francis@cs.ua.oz.au (Francis Vaughan)
Newsgroups: comp.sys.encore
Subject: Re: Multimax thrashing
Message-ID: <2195@sirius.ucs.adelaide.edu.au>
Date: 20 Dec 90 06:49:48 GMT
References: <1990Dec3.170300.14750@newcastle.ac.uk> <130064@infocenter.encore.com> <2166@sirius.ucs.adelaide.edu.au> <142@mx-1>
Sender: news@ucs.adelaide.edu.au
Reply-To: francis@cs.adelaide.edu.au
Organization: Adelaide University, Computer Science
Lines: 112
Nntp-Posting-Host: hydra.ua.oz.au

In article <142@mx-1>, ar@mcdd1 (Alastair Rae) writes:
|> francis@cs.ua.oz.au (Francis Vaughan) writes:
|> 
|> > ... 
|> > I wonder if some of the memory thrashing observed is due to the problem in
|> > the memory manager that locks down copy-on-write pages. Any sign of this
|> > being fixed? Our perception (on our machine anyway) is that this is costing
|> > us about half the performance of our machine. We are not happy.
|> > ...
|> 
|> I hadn't heard of this problem but I'm very interested to find out more.
|> When you pay through the nose for a big *nix box, you expect big performance!
|> I've had lots of fun :-( trying to tune our box and had the feeling
|> that something was wrong somewhere.
|> 
|> Could you elaborate, please, Francis?

No problem.

Most of this was covered in a posting from Gordon Irlam 
(gordoni@cs.adelaide.edu.au) on the 18th of September to comp.sys.encore. 
If anyone really wants the entire thing again I will either repost it, if
there are a few requests, or forward it individually, if there are only a very
few. The full posting includes a few ideas to help mitigate the impact, and
an example program that will really screw your machine as well.

We reported this problem to Encore in April, and have heard nothing since. In 
conversation with our local software support people, I was told last week that
Encore was satisfied with the design of the memory system and had no intention
of fixing the problem. I would love to be told otherwise.

Interested folk with source should look in the routine ageregion_sh in the file
sys_x.x/sys/vm_pageout.c, where x.x is your release version.


This is a small precis.


Umax 4.3 release 4.0.0, and all previous releases of BSD Umax, contain
a serious bug in the virtual memory system that prevents it from being
able to page out pages of processes under certain commonly occurring
circumstances.  This degrades system performance or, equivalently,
increases the amount of physical memory needed to obtain a given level
of performance.  In more extreme cases it may cause severe performance
problems or even deadlock.

Umax 4.3 is not able to page out copy on write pages.  The meaning of
this and its ramifications are explained below.

When a process forks under Umax, all of the modifiable pages of the
parent process are marked copy on write.  The same set of pages is
marked copy on write in the child process.  Because code pages are
read only they can be shared without being marked copy on write.
Marking a page copy on write means setting its protection to read
only, and then if a write to that page causes a translation fault a
copy of the page is made, the protection on the page is set to
read-write, and the faulting instruction re-executed.  Copy on write
pages minimize the cost of forking.

If, when a copy on write fault occurs, the copy on write page is no
longer shared with any other process, say because the child has
exited, the page will be set to read-write without needing to make
a copy of the page.  Note that this final giving away of a copy on
write page is not performed as soon as the page becomes owned by a
single process, but only when the last owner of the page writes to it.
If the last owner never writes to the page it will remain copy on
write despite the fact that it is not shared with anyone else.

Fortunately many processes,
    1) do not fork, or
    2) fork but have a reasonably small amount of data, or
    3) shortly after forking both child and parent,
           a) exit, or
           b) exec, or
           c) modify nearly all their data pages, or
    4) only access a few data pages immediately prior to
       forking, and then only read a few data pages at any time
       subsequent to forking.

Those cases where these constraints are not met cause the most
problems, and to a certain extent case 4 can also cause problems.  In
case 4 where a process only touches a few pages immediately prior to
forking, if the system was heavily loaded at the time prior to the
fork, most pages will have been swapped out, and so will not end up
being locked down by the fork - unless they are subsequently read in.
But if the system was lightly loaded at the time of the fork then case
4 will still cause a large number of pages to be locked down.

Our experience is that we cannot use much more swap space than twice
the physical memory on our machines, even though many of our processes
are idle for substantial periods of time.

We had considerable difficulty when we attempted to use a Multimax as
a server for a large number of X terminals.  The machine had
sufficient compute power, virtual memory, and physical memory for the
clients, but nearly all of the physical memory filled up with
non-pageable copy on write pages that weren't even being used.
Unfortunately the xterm binary was both long lived and caused a large
number of pages to be locked down for long periods of time.

Identifying the problem is fairly easy.  Sysparam will be showing the
system paging heavily, but when you do a ps you will find some pages
of processes remain in memory, even when they are idle or stopped.  In
more severe cases all of the system's memory may end up becoming
non-pageable, preventing you from even being able to log in.


						Francis Vaughan