Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!usc!samsung!rex!ukma!seismo!uunet!mcsun!ukc!dcl-cs!aber-cs!athene!pcg
From: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: Page size and linkers (was: Re: SunMMU history)
Message-ID:
Date: 23 Jan 91 15:50:35 GMT
References: <1991Jan19.133914.23871@bellcore.bellcore.com> <3981@skye.ed.ac.uk>
Sender: aro@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 87
Nntp-Posting-Host: athene
In-reply-to: richard@aiai.ed.ac.uk's message of 21 Jan 91 13:56:48 GMT

On 21 Jan 91 13:56:48 GMT, richard@aiai.ed.ac.uk (Richard Tobin) said:

richard> In article
richard> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

pcg> Incidentally, even a 4KB pagesize is still too large, even if it is
pcg> barely tolerable. 8KB was simply crazy, for a workstation/time sharing
pcg> server.

richard> Large page sizes seem likely to interact badly with programs
richard> containing a large number of small procedures,

Same for data. Other things being equal, a smaller page size is better
than a larger page size, down to a fairly small limit, mostly for this
reason. Naturally other things are not equal -- smaller page sizes imply
larger page tables, and IO costs are largely independent of page size.

richard> such as X clients.

Ah, also the X server. By private communication I have been told that
the X server typically spends 90% of its time in a 2KB stretch of code.
My reaction was disbelief, as this would imply that the X server code
working set would be around one page, which observation tells us it is
not.

I was told that in getting to those 2KB, control touches a large number
of layers of abstraction realized as widely scattered procedures. In
other words the path leading to the 2KB is fairly short in time but long
in jumps from one page to another... I would surmise that the same is
probably true for the X server's data. Arrrrggghhh.
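
To put rough numbers on the point about scattered procedures, here is a minimal sketch that counts the pages a call path touches at several page sizes. Everything in it is an assumption made for illustration: the procedure addresses and sizes are invented, the helper names (pages_touched, resident_bytes, page_table_entries) are hypothetical, and the page table is modelled as a flat one-level table. It is not a measurement of the X server or of any real system.

```python
# Illustrative only: all addresses and sizes below are invented.

def pages_touched(procs, page_size):
    """Return the set of page numbers covered by (start, length) byte ranges."""
    pages = set()
    for start, length in procs:
        first = start // page_size
        last = (start + length - 1) // page_size
        pages.update(range(first, last + 1))
    return pages


def resident_bytes(procs, page_size):
    """Bytes of real memory needed to hold every page the ranges touch."""
    return len(pages_touched(procs, page_size)) * page_size


def page_table_entries(span_bytes, page_size):
    """Entries needed to map span_bytes with a flat, one-level page table."""
    return -(-span_bytes // page_size)  # ceiling division


# Ten 200-byte procedures spaced 10,000 bytes apart (a scattered call path).
scattered = [(i * 10_000, 200) for i in range(10)]
# The same ten procedures laid out back to back.
packed = [(i * 200, 200) for i in range(10)]

for page_size in (1024, 2048, 4096, 8192):
    print(page_size,
          resident_bytes(scattered, page_size),
          resident_bytes(packed, page_size),
          page_table_entries(1 << 20, page_size))
```

With these made-up numbers the scattered path needs several times more resident memory at 8KB pages than at 1KB pages, while the table mapping a 1MB span shrinks by a factor of eight. That is the trade-off described above, and the packed layout shows how much of the cost comes from scattering rather than from code size.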

richard> An obvious improvement would be to have the linker order
richard> procedures so that procedures commonly used together were
richard> adjacent, and separate from rarely used procedures. Has this
richard> been done in any real systems?

Another lost art! Yes, it used to be popular before Unix. I have seen
several papers on the subject, from the late sixties to the end of the
seventies. You would have tools for profiling and rearranging code in
order to minimize the working set. I have even heard of an Algol 68
compiler with an "infrequently" pragma, used to tag infrequently
executed sections of code to be moved offline.

As I have already remarked, programming for "locality" is a dark lost
secret nowadays -- we have got the worst aspects of Lisp programming in
languages like C without the benefits! Memory is cheap, so instead of
using it for more powerful applications it is used for sloppier
programming.

I once had a short e-mail discussion with the author on the appalling
inefficiencies intrinsic in the GNU Emacs design decisions. The usual
answer was "who cares! modern machines are so fast!".

The problem is that the GNU Emacs botches do not have a constant cost;
that is, say an overhead of 100KB or of 1 million needless instructions,
which would become percentage-wise insignificant with time. No, like
many sloppily written programs, it has inefficiencies proportional to
the size and speed of the machine/application.

And applications grow in size; once one could complain about GNU Emacs
being slow in editing 100KB buffers and running 200 line ELisp functions
on a 68010. Now that we have machines that are 20 times as fast and
large, if we still edit 100KB files and run 200 line ELisp functions,
GNU Emacs looks fast. Unfortunately now we want to edit 2MB files and
run 4000 line ELisp functions routinely, and GNU Emacs still looks as
slow as it did then, or even slower, because the inefficiency overhead
is more than linear.
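
The "more than linear" remark can be made concrete with a small sketch. It assumes a power-law cost model in which running time grows as workload size raised to some exponent. That model, the exponent 1.5, and the function name relative_time are all assumptions chosen for illustration, not measurements of GNU Emacs.

```python
def relative_time(size_factor, speed_factor, exponent):
    """Running time relative to the old setup, assuming cost ~ size**exponent.

    The power-law model and any exponent passed in are illustrative
    assumptions, not measured properties of a real program.
    """
    return size_factor ** exponent / speed_factor


# Same workload on a machine 20 times as fast.
same_work = relative_time(1, 20, 1.5)
# Workload 20 times as large on that machine, with linear overhead.
linear = relative_time(20, 20, 1.0)
# Workload 20 times as large, with overhead growing as size**1.5.
superlinear = relative_time(20, 20, 1.5)

print(same_work, linear, superlinear)
```

Under this model the old workload runs in a twentieth of the time, a 20 times larger workload with linear overhead takes as long as before, and any exponent above 1 makes the larger workload slower on the new machine than the old workload was on the old one.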

So we have the choice of investing the 20-fold improvement in machine
power either in decent responsiveness, running the same things we did
with machines 20 times less powerful, or in running applications that
are 20 times as large at the same old slow pace.

Another example is the abject System V expansion swap technology; the
"cure" suggested by AT&T is to have about 10 times more memory than is
needed for the working sets of active programs, so that the misdesign is
never exercised. OK, memory is cheap -- but the *factor* of 10 remains
there, whatever the application size is, and it irks me for some stupid
reason that I have to reserve 90% of my real memory for AT&T's swapping
inanity, instead of for running applications that are ten times as
large.

So, the figleaf "machines are getting faster" really does not cover
much...
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk