Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!apple!amdcad!rpw3
From: rpw3@amdcad.AMD.COM (Rob Warnock)
Newsgroups: comp.arch
Subject: Re: 80486 vs. 68040 code size [really: how many regs]
Message-ID: <25662@amdcad.AMD.COM>
Date: 17 May 89 17:55:53 GMT
References: <950@aber-cs.UUCP>
Reply-To: rpw3@amdcad.UUCP (Rob Warnock)
Distribution: eunet,world
Organization: [Consultant] San Mateo, CA
Lines: 43

In article <950@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
+---------------
| This is a very good argument, so far, for an AMD 29k style very large
| register file, that becomes a statically managed first level memory, or for
| a SPARC style set of (less statically managed) windows...
+---------------

Hmmm... the 29k register windowing seems to have been misunderstood again...

On the Am29000, while the local registers (the 128 "windowed" or "stack
cache" regs) are statically *named* (on entry to a routine, lr0 = return
address, lr2 = first arg, etc.), the implied "first level memory" (when the
register set is considered as a stack cache) is *dynamically* managed
according to the instantaneous depth of the call stack by the routine entry
and exit assertions. That is, there is no predetermined correlation between
subroutine calls and spill/fill activity -- some regs are spilled (saved to
memory) when a new frame is opened if the cache is full, and regs are filled
(restored from memory) when a return is made to an upper-level routine whose
frame was spilled (i.e., stack cache is "empty"). Since most frames are much
smaller than the register file, there's *lots* of hysteresis.

Also, the decrementing of the stack pointer (gr1) on routine entry (to open
a variable-sized frame of 2 to 126 regs) accomplishes a dynamic remapping
of which underlying physical registers the static local register names will
access. [The "mapping" is of course trivial: bits <8:2> of "gr1" are added
to the local register number (modulo 128) to give physical register number.]
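
[A toy sketch of the two mechanisms just described, in Python rather than
29k code. The names phys_reg, StackCache, call and ret are made up for
illustration. The model counts registers only: it ignores the overlap
between caller and callee frames, and it assumes every frame is at most
126 regs. It is meant to show the remapping arithmetic and the hysteresis,
not to reproduce the actual entry/exit assertion sequences.]

```python
NUM_LOCAL = 128  # local ("stack cache") registers on the Am29000


def phys_reg(gr1, n):
    """Map the static name lr<n> to a physical local register index.

    Bits <8:2> of gr1 (a 7-bit field) are added to n, modulo 128.
    """
    return (((gr1 >> 2) & 0x7F) + n) % NUM_LOCAL


class StackCache:
    """Hypothetical model: spill/fill depends on call depth, not on calls."""

    def __init__(self, size=NUM_LOCAL):
        self.size = size
        self.frames = []    # frame sizes in registers, outermost first
        self.resident = 0   # registers currently held in the register file
        self.spilled = 0    # total registers saved to memory so far
        self.filled = 0     # total registers restored from memory so far

    def call(self, frame_regs):
        """Open a frame; spill the oldest registers only if the file overflows."""
        self.frames.append(frame_regs)
        self.resident += frame_regs
        excess = self.resident - self.size
        if excess > 0:
            self.spilled += excess
            self.resident -= excess

    def ret(self):
        """Close the current frame; fill only if the caller's frame was spilled."""
        self.resident -= self.frames.pop()
        if self.frames:
            missing = self.frames[-1] - self.resident
            if missing > 0:
                self.filled += missing
                self.resident += missing


# Remapping: opening a 4-register frame moves gr1 down by 4 words (16 bytes),
# so the caller's lr0 is reachable from the callee under the name lr4.
caller_gr1 = 0x1F8
callee_gr1 = caller_gr1 - 4 * 4
same_physical = phys_reg(caller_gr1, 0) == phys_reg(callee_gr1, 4)

# Hysteresis: fill the file with eight 16-register frames, then bounce
# between depth 8 and depth 9 a hundred times.
cache = StackCache()
for _ in range(8):
    cache.call(16)
for _ in range(100):
    cache.call(16)
    cache.ret()
```

[In this model the hundred call/return pairs at the boundary cost a single
16-register spill and no fills: the first overflow frees room that every
later call at that depth reuses.]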

Since the overlapping windows [incoming args are in both the caller's and
callee's frames] may be of any size (2-126), the 29k's windows are much
"less statically managed" than the SPARC's. Even if the 29k had the same
number of registers available to the windowing mechanism (instead of many
more), it would still typically allow more routines' windows to be in regs
at once, reducing memory traffic.


Rob Warnock
Systems Architecture Consultant

UUCP:	  {amdcad,fortune,sun}!redwood!rpw3
DDD:	  (415)572-2607
USPS:	  627 26th Ave, San Mateo, CA  94403