Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!ames!uhccux!munnari.oz.au!yoyo.aarnet.edu.au!sirius.ucs.adelaide.edu.au!hydra!francis
From: francis@cs.ua.oz.au (Francis Vaughan)
Newsgroups: comp.sys.encore
Subject: Re: Multimax thrashing
Message-ID: <2195@sirius.ucs.adelaide.edu.au>
Date: 20 Dec 90 06:49:48 GMT
References: <1990Dec3.170300.14750@newcastle.ac.uk> <130064@infocenter.encore.com> <2166@sirius.ucs.adelaide.edu.au> <142@mx-1>
Sender: news@ucs.adelaide.edu.au
Reply-To: francis@cs.adelaide.edu.au
Organization: Adelaide University, Computer Science
Lines: 112
Nntp-Posting-Host: hydra.ua.oz.au

In article <142@mx-1>, ar@mcdd1 (Alastair Rae) writes:
|> francis@cs.ua.oz.au (Francis Vaughan) writes:
|>
|> > ...
|> > I wonder if some of the memory thrashing observed is due to the problem in
|> > the memory manager that locks down copy-on-write pages. Any sign of this being
|> > fixed? Our perception (on our machine anyway) is that this is costing us about
|> > half the performance of our machine. We are not happy.
|> > ...
|>
|> I hadn't heard of this problem but I'm very interested to find out more.
|> When you pay through the nose for a big *nix box, you expect big performance!
|> I've had lots of fun :-( trying to tune our box and had the feeling
|> that something was wrong somewhere.
|>
|> Could you elaborate, please, Francis?

No problem. Most of this was covered in a posting from Gordon Irlam
(gordoni@cs.adelaide.edu.au) on the 18th of September to
comp.sys.encore. If anyone really wants the entire thing again I will
either repost, if there are a few requests, or individually forward it,
if there are only a very few. The full posting includes a few ideas to
help mitigate the impact and an example program to really screw your
machine as well.

We reported this problem to Encore in April, and have heard nothing since.
In conversation with our local software support people, I was told last
week that Encore was satisfied with the design of the memory system and
had no intention of fixing the problem. I would love to be told
otherwise.

Interested folk with source should look in the routine ageregion_sh in
the file sys_x.x/sys/vm_pageout.c where x.x is your release version.

This is a small precis.

Umax 4.3 release 4.0.0, and all previous releases of BSD Umax, contain a
serious bug in the virtual memory system that prevents it from paging
out the pages of processes under certain commonly occurring
circumstances. This degrades system performance or, equivalently,
increases the amount of physical memory needed to obtain a given level
of performance. In more extreme cases it may cause severe performance
problems or even deadlock.

Umax 4.3 is not able to page out copy on write pages. The meaning of
this and its ramifications are explained below.

When a process forks under Umax all of the modifiable pages of the
parent process are marked copy on write. The same set of pages is marked
copy on write in the child process. Because code pages are read only
they can be shared without being marked copy on write.

Marking a page copy on write means setting its protection to read only.
If a write to that page then causes a translation fault, a copy of the
page is made, the protection on the copy is set to read-write, and the
faulting instruction is re-executed. Copy on write pages minimize the
cost of forking.

If, when a copy on write fault occurs, the copy on write page is no
longer shared with any other process, say because the child has exited,
the page will be set to read-write without needing to make a copy of
the page.

Note that this final giving away of a copy on write page is not
performed as soon as the page becomes owned by a single process, but
only when the last owner of the page writes to it.
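
For anyone who finds a state table easier to follow than prose, the
life cycle just described can be sketched as a toy model in C. This is
not Umax source and is not taken from vm_pageout.c; the struct, its
fields, and every function name are made up for illustration, and
page_pageable() simply encodes the reported behaviour (a page still
flagged copy on write is never paged out).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one physical data page. Hypothetical names throughout;
 * this is an illustration, not the Umax kernel data structure. */
struct page {
    int  owners;        /* processes currently mapping this page */
    bool copy_on_write; /* mapped read only, waiting for a write fault */
    int  copies_made;   /* private copies handed out to writers */
};

/* A freshly allocated, writable page owned by one process. */
struct page page_new(void)
{
    struct page p = { 1, false, 0 };
    return p;
}

/* fork(): parent and child now share the page, marked copy on write. */
void page_fork(struct page *p)
{
    p->owners += 1;
    p->copy_on_write = true;
}

/* One owner exits or execs. The share count drops, but nothing
 * revisits the copy on write flag at this point. */
void page_owner_gone(struct page *p)
{
    assert(p->owners > 1);
    p->owners -= 1;
}

/* An owner writes to the page. Returns true if a copy had to be made. */
bool page_write_fault(struct page *p)
{
    if (!p->copy_on_write)
        return false;            /* ordinary writable page */
    if (p->owners > 1) {
        p->copies_made += 1;     /* writer leaves with a private copy */
        p->owners -= 1;          /* original stays copy on write */
        return true;
    }
    p->copy_on_write = false;    /* last owner: just flip to read-write */
    return false;
}

/* The reported behaviour: copy on write pages are never paged out. */
bool page_pageable(const struct page *p)
{
    return !p->copy_on_write;
}
```

Calling page_new(), page_fork(), and then page_owner_gone() leaves a
page with a single owner that page_pageable() still rejects. In this
model only a later page_write_fault() by that owner clears the flag.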
If the last owner never writes to the page it will remain copy on write
despite the fact that it is not shared with anyone else.

Fortunately many processes
  1) do not fork, or
  2) fork but have a reasonably small amount of data, or
  3) shortly after forking both child and parent
     a) exit, or
     b) exec, or
     c) modify nearly all their data pages, or
  4) only access a few data pages immediately prior to forking, and
     then only read a few data pages at any time subsequent to forking.

Those cases where these constraints are not met cause the most
problems, and to a certain extent case 4 can also cause problems. In
case 4, where a process only touches a few pages immediately prior to
forking, if the system was heavily loaded at the time prior to the fork,
most pages will have been swapped out, and so will not end up being
locked down by the fork - unless they are subsequently read in. But if
the system was lightly loaded at the time of the fork then case 4 will
still cause a large number of pages to be locked down.

Our experience is that we cannot use much more swap space than twice
the physical memory on our machines, even though many of our processes
are idle for substantial periods of time.

We had considerable difficulty when we attempted to use a Multimax as a
server for a large number of X terminals. The machine had sufficient
compute power, virtual memory, and physical memory for the clients, but
nearly all of the physical memory filled up with non-pageable copy on
write pages that weren't even being used. Unfortunately the xterm
binary was both long lived and caused a large number of pages to be
locked down for long periods of time.

Identifying the problem is fairly easy. Sysparam will show the system
paging heavily, but when you do a ps you will find that some pages of
processes remain in memory, even when the processes are idle or
stopped. In more severe cases all of the system's memory may end up
becoming non-pageable, preventing you from even being able to log in.
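
To make the effect on the pageout scan concrete, here is a rough sketch
in C of a scan that refuses to evict copy on write frames. It is a toy
model only: it is not the ageregion_sh routine, and the frame
structure, the function names, and the skip_cow switch are all invented
for illustration. With skip_cow set the scan mimics the behaviour
described above; with it clear, idle copy on write frames are treated
like any other idle frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy descriptor for one frame of physical memory (hypothetical). */
struct frame {
    bool in_use;         /* frame currently holds a resident page */
    bool copy_on_write;  /* page is still flagged copy on write */
    bool recently_used;  /* page was touched since the last scan */
};

/* Mark n frames as resident and idle; the first `cow` of them are
 * copy on write, as left behind by long-lived forking processes. */
void fill_idle(struct frame *frames, size_t n, size_t cow)
{
    assert(frames != NULL && cow <= n);
    for (size_t i = 0; i < n; i++) {
        frames[i].in_use = true;
        frames[i].copy_on_write = (i < cow);
        frames[i].recently_used = false;
    }
}

/* Try to free `wanted` frames by evicting idle pages. When skip_cow
 * is true, copy on write pages are passed over, however idle they
 * are. Returns the number of frames actually freed. */
size_t pageout_scan(struct frame *frames, size_t n, size_t wanted,
                    bool skip_cow)
{
    size_t freed = 0;

    assert(frames != NULL);
    for (size_t i = 0; i < n && freed < wanted; i++) {
        struct frame *f = &frames[i];

        if (!f->in_use || f->recently_used)
            continue;            /* nothing to evict, or page is busy */
        if (skip_cow && f->copy_on_write)
            continue;            /* the problem case: page stays put */
        f->in_use = false;       /* evict: frame is now free */
        freed++;
    }
    return freed;
}
```

With made-up numbers, say 100 idle frames of which 90 are copy on
write, a request for 50 frames can free only the 10 ordinary frames
when skip_cow is set, and the scan comes up 40 frames short no matter
how long those pages have been idle.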
Francis Vaughan