Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!usc!samsung!rex!ukma!seismo!uunet!mcsun!ukc!dcl-cs!aber-cs!athene!pcg
From: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: Page size and linkers (was: Re: SunMMU history)
Message-ID: <PCG.91Jan23155035@athene.cs.aber.ac.uk>
Date: 23 Jan 91 15:50:35 GMT
References: <1991Jan19.133914.23871@bellcore.bellcore.com>
	<PCG.91Jan20191955@teacho.cs.aber.ac.uk> <3981@skye.ed.ac.uk>
Sender: aro@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 87
Nntp-Posting-Host: athene
In-reply-to: richard@aiai.ed.ac.uk's message of 21 Jan 91 13:56:48 GMT

On 21 Jan 91 13:56:48 GMT, richard@aiai.ed.ac.uk (Richard Tobin) said:

richard> In article <PCG.91Jan20191955@teacho.cs.aber.ac.uk>
richard> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

pcg> Incidentally, even a 4KB pagesize is still too large, even if it is
pcg> barely tolerable. 8KB was simply crazy, for a workstation/time sharing
pcg> server.

richard> Large page sizes seem likely to interact badly with programs
richard> containing a large number of small procedures,

Same for data. Other things being equal, a smaller page size is better
than a larger page size, down to a fairly small limit, mostly for this
reason. Naturally other things are not equal -- smaller page sizes imply
larger page tables, and IO costs are largely independent of page size.
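
A rough sketch of both sides of that tradeoff follows. The 4-byte page
table entry, the flat single-level table, and the layout of small
objects at a fixed spacing are all assumptions made for illustration;
none of the names or numbers come from a real system.

```python
# Illustrative model only: every parameter here is an assumption.

PTE_BYTES = 4  # assumed size of one page table entry


def page_table_bytes(address_space_bytes, page_bytes):
    """Size of a flat page table mapping the whole address space."""
    pages = -(-address_space_bytes // page_bytes)  # ceiling division
    return pages * PTE_BYTES


def resident_bytes(touched_objects, object_bytes, page_bytes, spacing_bytes):
    """Memory kept resident when small objects (procedures or data),
    laid out every spacing_bytes, are all touched.
    Each touched object pins every page it overlaps."""
    pages = set()
    for i in range(touched_objects):
        start = i * spacing_bytes
        end = start + object_bytes - 1
        pages.update(range(start // page_bytes, end // page_bytes + 1))
    return len(pages) * page_bytes
```

With 100 objects of 200 bytes spaced 16KB apart, resident_bytes gives
819200 bytes for 8KB pages and 102400 bytes for 1KB pages, while the
flat table for a 16MB address space grows from 8192 to 65536 bytes.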

richard> such as X clients.

Ah, also the X server. By private communication I have been told that
the X server typically spends 90% of its time in a 2KB stretch of code.
My reaction was disbelief, as this would imply that the X server code
working set would be around one page, which observation tells us it is
not.

I was told that in getting to those 2KB, control touches a large number
of layers of abstraction realized as widely scattered procedures.  In
other words the path leading to the 2KB is fairly short in time but long
in jumps from one page to another... I would surmise that the same is
probably true for the X server's data. Arrrrggghhh.

richard> An obvious improvement would be to have the linker order
richard> procedures so that procedures commonly used together were
richard> adjacent, and separate from rarely used procedures.  Has this
richard> been done in any real systems?

Another lost art! Yes, it used to be popular before Unix. I have seen
several papers on the subject, late sixties to end of the seventies. You
would have tools for profiling and rearranging code in order to minimize
the working set. I have even heard of an Algol 68 compiler with an
"infrequently" pragma, used to tag infrequently executed sections of
code so that they could be moved out of line.
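
The idea can be sketched in a few lines. This is not a reconstruction
of any of those old tools; the function names, the call counts and the
"hot procedures first" rule are assumptions chosen to keep the example
small.

```python
def pages_touched(layout, sizes, used, page_bytes):
    """Count distinct pages overlapped by the procedures in `used`,
    with procedures laid out contiguously in `layout` order."""
    pages = set()
    offset = 0
    for name in layout:
        size = sizes[name]
        if name in used:
            first = offset // page_bytes
            last = (offset + size - 1) // page_bytes
            pages.update(range(first, last + 1))
        offset += size
    return len(pages)


def reorder_by_profile(layout, call_counts):
    """Place frequently called procedures first, rarely called ones last.
    sorted() is stable, so procedures with equal counts keep their order."""
    return sorted(layout, key=lambda name: -call_counts.get(name, 0))


# Hypothetical program: 16 procedures of 1KB each, 4 of them hot.
layout = ["p%d" % i for i in range(16)]
sizes = {name: 1024 for name in layout}
hot = {"p0", "p4", "p8", "p12"}
profile = {name: 100 for name in hot}  # assumed profile data

before = pages_touched(layout, sizes, hot, 4096)
after = pages_touched(reorder_by_profile(layout, profile), sizes, hot, 4096)
```

In this made-up case the hot procedures span 4 pages in the original
order and 1 page after reordering.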

As I have already remarked, programming for "locality" is a dark lost
secret nowadays -- we have got the worst aspects of Lisp programming in
languages like C without the benefits!

Memory is cheap, so instead of using it for more powerful applications
it is used for sloppier programming.

I once had a short e-mail discussion with the author on the appalling
inefficiencies intrinsic in the GNU Emacs design decisions. The usual
answer was "who cares! modern machines are so fast!".

The problem is that the GNU Emacs botches do not have a constant cost,
say an overhead of 100KB or of 1 million needless instructions, which
would become insignificant percentage-wise with time. No, like many
sloppily written programs it has inefficiencies proportional to the
size and speed of the machine/application. And applications grow in
size; once one could complain about GNU Emacs being slow in editing
100KB buffers and running 200 line ELisp functions on a 68010. Now that
we have machines that are 20 times as fast and large, if we still edit
100KB files and run 200 line ELisp functions GNU Emacs looks fast.

Unfortunately now we want to edit 2MB files and run 4000 line ELisp
functions routinely, and GNU Emacs still looks as slow as it was then,
or even slower, because the inefficiency overhead is more than linear.
So we have the choice of investing the 20 fold improvement in machine
power either to get decent responsiveness running the same things we
did on machines 20 times less powerful, or to run applications that
are 20 times as large at the same old slow pace.
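
As a back of the envelope model, assume cost grows as size to some
power and is divided by machine speed. The power law and the exponents
below are assumptions for illustration, not measurements of GNU Emacs.

```python
def relative_response_time(size_factor, speed_factor, exponent):
    """Response time relative to the old machine running the old
    workload, assuming cost proportional to size ** exponent."""
    return size_factor ** exponent / speed_factor


same_job_new_machine = relative_response_time(1, 20, 2.0)
linear_overhead = relative_response_time(20, 20, 1.0)
quadratic_overhead = relative_response_time(20, 20, 2.0)
```

Under these assumptions the old job runs in 1/20 of the time, a 20
times larger job with linear overhead runs exactly as slowly as before,
and with quadratic overhead it runs 20 times slower.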

Another example is the abject System V expansion swap technology; the
"cure" suggested by AT&T is to have about 10 times more memory than is
needed for the working sets of active programs, so that the misdesign
is never exercised. OK, memory is cheap -- but the *factor* of 10
remains there, whatever the application size is, and it irks me for some
stupid reason that I have to reserve 90% of my real memory for AT&T's
swapping inanity, instead of using it to run applications that are ten
times as large.

So, the figleaf "machines are getting faster" really does not cover
much...
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk