Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!uakari.primate.wisc.edu!uflorida!mephisto!ukma!rutgers!att!cbnewsh!dwc
From: dwc@cbnewsh.ATT.COM (Malaclypse the Elder)
Newsgroups: comp.unix.wizards
Subject: Re: Gripe about mickey-mouse VM behaviour on many Unixes
Message-ID: <7760@cbnewsh.ATT.COM>
Date: 30 Jan 90 02:42:52 GMT
References: <22105@adm.BRL.MIL> <1424@eutrc3.urc.tue.nl>
Organization: The Legion of Dynamic Discord
Lines: 66

In article <1424@eutrc3.urc.tue.nl>, wsinpdb@eutws1.win.tue.nl (Paul de Bra) writes:
> In article <22105@adm.BRL.MIL> Ed@alderaan.scrc.symbolics.com (Ed Schwalenberg) writes:
> >...
> >And if you don't have enough, you lose just as badly.  Under System V
> >Unix for the 386, when your large process exceeds the amount of
> >non-wired physical memory, the paging algorithm pages out the ENTIRE
> >process (which takes a LONG time), then lets your poor process fault
> >itself in again, oh so painfully, until you exceed physmem again and
> >start the cycle over.
> 
> This most certainly is not true.
> I have experimented with growing processes and what really happens is
> that when the total size of all processes approaches physical memory
> size the pager starts to page out some old pages. I can have a process
> grow slowly and never really be paged (or swapped) out completely.
> (I have tried this with a 20Mbyte process on a machine with 8Mbyte of
> memory).
> 
> However, if a process is using the standard malloc() routine to allocate
> memory, then in order to allocate more memory malloc will search through
> a linked list of pointers, which are scattered throughout your process'
> memory. This usually involves a lot of paging, and it indeed is possible
> that all pages of a process are paged out, while other pages (of the
> same process) are being paged in. I have observed this behaviour with
> a process that exceeded physical memory only by a small margin.
> The solution is to use the routines in libmalloc, which do not use the
> scattered linked list of pointers. Switching to libmalloc completely
> stopped the thrashing.
> 
> The malloc() routine in BSD does not use the linked list approach either,
> so a growing process does not cause this kind of thrashing in BSD.
> 
i'm not sure about the user side of things (e.g. malloc), but i think what
the original poster was referring to is the fact that in system v release 3,
if a page fault could not find any free physical memory, the faulting
process would roadblock and be put in the SXBRK state.  the memory
scheduler, sched, would then be awakened to swap out one or more processes.
note that when this happens, it is VERY LIKELY that other processes
will also roadblock on memory in the same state.  i use the "keystone cops"
as the visualization aid for this effect.  this is the reason why SVR3
could go idle on a busy system (you can see it if you examine sar output
for a memory-overloaded system).  but i digress.
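
to make the pile-up concrete, here is a toy model of it.  this is only a
sketch and not SVR3 code: the struct, the field names and toy_tick() are
all made up for illustration, and only the SXBRK name comes from the real
system.  each runnable process asks for some pages per tick; if the free
pool can't cover the request, the process roadblocks.

```c
#include <assert.h>
#include <stddef.h>

/* toy states; only the SXBRK name is taken from SVR3 */
enum pstate { P_RUN, P_SXBRK };

struct toyproc {
    enum pstate state;
    int want;               /* pages faulted on per tick (made-up number) */
};

/*
 * one tick of the toy model: every runnable process faults for `want`
 * pages.  if the free pool cannot satisfy the fault, the process
 * roadblocks in SXBRK and waits for the swapper.
 * returns the number of processes that are still runnable afterwards;
 * 0 means the cpu would sit idle even though there is work to do.
 */
int toy_tick(struct toyproc *p, size_t n, int *freepages)
{
    int runnable = 0;
    size_t i;

    assert(p != NULL && freepages != NULL);
    for (i = 0; i < n; i++) {
        if (p[i].state != P_RUN)
            continue;                   /* already roadblocked */
        if (p[i].want > *freepages) {
            p[i].state = P_SXBRK;       /* no memory: roadblock */
            continue;
        }
        *freepages -= p[i].want;        /* fault satisfied */
        runnable++;
    }
    return runnable;
}
```

with three processes each wanting 4 pages and 6 pages free, one process
gets through the first tick and the other two roadblock; on the next tick
the survivor roadblocks too and nothing is left to run.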

i don't remember what the final method was for handling the case of
a single process on the run queue faulting and using more than physical
memory, but one iteration of it had the memory scheduler do nothing
in that situation.  it would then be up to the paging daemon to steal
pages from that process (page aging was done according to wall clock time).
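
as a rough illustration of aging by wall clock time, the sketch below
steals resident pages whose last reference is older than some threshold.
the struct, the last_ref field, the threshold and toy_steal() are all
assumptions made for the example; the real paging daemon is not shown
here and surely differed in detail.

```c
#include <assert.h>
#include <stddef.h>

struct toypage {
    long last_ref;          /* wall-clock second of last reference (assumed) */
    int  resident;          /* nonzero if the page is in core */
};

/*
 * steal up to `need` resident pages whose age, measured in wall-clock
 * seconds, is at least `max_age`.  returns the number of pages stolen.
 */
int toy_steal(struct toypage *pg, size_t n, long now, long max_age, int need)
{
    int stolen = 0;
    size_t i;

    assert(pg != NULL);
    for (i = 0; i < n && stolen < need; i++) {
        if (!pg[i].resident)
            continue;
        if (now - pg[i].last_ref >= max_age) {
            pg[i].resident = 0;         /* page goes back to the free pool */
            stolen++;
        }
    }
    return stolen;
}
```

one consequence of using wall clock time in a model like this is that
pages keep aging while their owner is not running.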

note that the swapping was not done in a single i/o operation but was
ultimately broken up by at least the device driver into track-size pieces.
but that doesn't address the problem of latency.  the process was subject
to latency on swapping and would have to painfully page its 'working set'
back in.  it would certainly make more sense to swap out in conveniently
sized pieces, leaving the process ineligible to run (we don't want a process
that is being swapped out continuing to contend for pages) until either
the memory shortage cleared up or the entire process was swapped out.
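
a minimal sketch of that piecewise swap-out, under stated assumptions:
the state names, the chunk size, the free-page target and toy_swap_step()
are invented for the example and are not taken from any SVR3 source.  the
victim is held off the run queue while it is being swapped, and swapping
stops as soon as either the shortage clears or nothing of the process is
left in core.

```c
#include <assert.h>
#include <stddef.h>

enum swstate { SW_RUNNABLE, SW_SWAPPING, SW_OUT };

struct toyswap {
    enum swstate state;
    int resident;           /* pages of the victim still in core */
};

/*
 * write out at most `chunk` pages of the victim.  while the state is
 * SW_SWAPPING the victim must not run, so it cannot contend for pages.
 * it becomes runnable again once the free pool reaches `target`, or is
 * marked SW_OUT once it has no pages left in core.
 */
enum swstate toy_swap_step(struct toyswap *v, int chunk,
                           int *freepages, int target)
{
    int n;

    assert(v != NULL && freepages != NULL && chunk > 0);
    if (v->state == SW_OUT)
        return v->state;                /* nothing left to do */
    if (*freepages >= target) {
        v->state = SW_RUNNABLE;         /* shortage already cleared */
        return v->state;
    }
    n = v->resident < chunk ? v->resident : chunk;
    v->resident -= n;                   /* one chunk goes to swap */
    *freepages += n;
    if (v->resident == 0)
        v->state = SW_OUT;              /* entire process swapped */
    else if (*freepages >= target)
        v->state = SW_RUNNABLE;         /* shortage cleared part way */
    else
        v->state = SW_SWAPPING;         /* keep it off the run queue */
    return v->state;
}
```

in this model a process whose swap-out is cut short keeps whatever part of
its working set is still in core, so it has less to page back in.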

this idea was incorporated into a regions-based prototype designed to
handle memory contention, load control, and page replacement in a more
sane manner.  of course, with SVR4, the regions architecture went out
the window and we would have to redesign a prototype based on VM (not
an easy task).  we may eventually do it though.

danny chen
att!hocus!dwc