Path: utzoo!utgpu!news-server.csri.toronto.edu!helios.physics.utoronto.ca!ists!yunexus!davecb
From: davecb@yunexus.YorkU.CA (David Collier-Brown)
Newsgroups: comp.arch
Subject: Re: Page size and linkers
Message-ID: <21655@yunexus.YorkU.CA>
Date: 10 Feb 91 16:57:49 GMT
References: <45242@mips.mips.COM>
Organization: York U. Computing Services
Lines: 77

On 4 Feb 91 19:09:49 GMT, md@HQ.Ileaf.COM (Mark Dionne x5551) said:
md> I have done some experiments with Interleaf (a large publishing
md> program written in C), and found that for many usage patterns,
md> typically about 25% of the code that is paged-in is actually
md> touched. This can mean that up to 2 meg of memory is often being
md> "wasted".

pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
| Not unsurprising; there are lots of statistics that say that on average
| only 200 instructions are executed before a "long jump" is executed;
| this means that small page sizes tend to reduce working sets
| dramatically. [...] A large amount
| of statistics already exists on this subject, but they are at times
| twenty years old. It would be interesting to see them confirmed with
| data on more recent applications/languages.

  Well, some of the twenty-year-old data came from a (pre)linker on
ICL's George III, which I remember fuzzily...

  Much more recently (5 years ago) I had the doubtful pleasure of
linking C code for a large, multi-module program for the IBM Poisonous
Computer while working on Xanaro's ``Ability''. We were using a program
which evolved from an overlay scheme, but had been rewritten into a
pager with the same interface by one of our senior people, Mr. Andrew
Forber. This didn't actually generate statistics, but did allow us to
manually reorder code by locality of reference.
Since we were writing classical ``structured'' code and had a compiler
which implemented static-to-file by placing multiple functions in a
single .o file, you understand we tended to discover the dynamic call
graph of the programs very quickly...

  The 200 instruction figure was approximately correct for all
non-unrolled code, and quite low for things which we found on the
critical path and expanded inline (such as file i/o, paging and the
compiler/interpreter engine). About 20% of the code was initializers of
some sort or another, and could usefully be moved to ``pages'' which
were loaded at startup time, called by the main() routine and then were
never referenced again.

  As the system was partitioned into several large modules, we found
three definite orders of locality of reference. The weakest was within
the module as a whole: the user interface used the user-visible
operations which used the primitives. The second weakest was within the
primitives, many of which interacted in interesting ways (e.g., to
preserve consistency), and tended to call each other. The strongest was
between the user-level operations and the primitives. A given
user-level operation tended to call the same set of primitives over and
over again. This was true of all the modules, not just the ones which
were ``similar''.

  Indeed, we noticed a counterintuitive effect: some few user-level
operations called primitives which we expected only to be called by
other modules entirely! The text editor often found its file i/o
operations calling primitives bound with the spreadsheet, because of a
reference from one file's data to another. We first discovered this
when the speed of file i/o suddenly fell to about 1 line/sec,
accompanied by much thrashing when two routines started fighting over
the same ``page''. Breaking the modules up into smaller and smaller
``pages'' yielded order-of-magnitude improvements in performance.
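
  [Illustration: the thrashing described above can be reproduced with
a toy model. This is only a sketch under stated assumptions, not the
pager we actually used: the LRU replacement policy, the single resident
page slot, and the routine names read_line and cell_lookup are all
hypothetical. It counts page loads for a call trace in which an editor
file i/o routine calls a spreadsheet primitive once per line, first
with the two routines laid out by module and then with them placed on
one page by locality of reference.]

```python
from collections import OrderedDict


def count_page_faults(trace, layout, frames):
    """Count page loads for a call trace under an LRU pager.

    trace  -- sequence of function names in call order
    layout -- dict mapping function name -> page name
    frames -- number of pages that can be resident at once
    """
    resident = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for func in trace:
        page = layout[func]
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        faults += 1
        if len(resident) >= frames:
            resident.popitem(last=False)  # evict least recently used page
        resident[page] = None
    return faults


# Hypothetical trace: the editor's file i/o routine calls a spreadsheet
# primitive for each of 100 lines read.
trace = ["read_line", "cell_lookup"] * 100

# Layout by module: the two routines live on different pages.
by_module = {"read_line": "editor", "cell_lookup": "spreadsheet"}
# Layout by locality: routines that call each other share a page.
by_locality = {"read_line": "hot", "cell_lookup": "hot"}

print(count_page_faults(trace, by_module, frames=1))    # 200
print(count_page_faults(trace, by_locality, frames=1))  # 1
```

  [With one slot, the by-module layout reloads a page on every call,
while the by-locality layout loads its page once. The real ratio
depends on page size, slot count and the pager's policy, none of which
are modelled here.]
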
Hand-tuning generally gave us two orders of magnitude (or more
correctly, cost us two orders of magnitude when I did it wrong (:-)).

  This was one of the ``small'' things referred to by Dick McMurray's
dictum:
	First you make it right,
	then you make it fast,
	then you make it **small**.

--dave
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA | lethe!dave
72 Abitibi Ave.,      |
Willowdale, Ontario,  | Even cannibals don't usually eat their
CANADA. 416-223-8968  | friends.