Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!sun-barr!ames!ncar!boulder!unicads!les
From: les@unicads.UUCP (Les Milash)
Newsgroups: comp.arch
Subject: Re: Register usage
Message-ID: <479@unicads.UUCP>
Date: 2 Jun 89 16:07:54 GMT
References: <259@mindlink.uucp> <25382@ames.arc.nasa.gov> <1RcY6x#64Zq3Y=news@anise.acc.com> <107302@sun.eng.sun.com> <1Rd5QS#7Pjn10=news@anise.acc.com>
Reply-To: les@unicads.UUCP (Les Milash)
Organization: Unicad Boulder, CO
Lines: 49

In article <1Rd5QS#7Pjn10=news@anise.acc.com> lars@salt.acc.com (Lars J Poulsen) writes:
>What I am talking about is saving and restoring registers when the
>operating system switches from one user process to another. This is not
>something that the compiler can improve on. Maybe the architecture could
>define a "highest register currently in use" pointer, and encourage
>context switches just before it gets incremented ?

Somebody recently also pointed out that the INMOS Transputer can c-switch
in .6uS. The way they do it is sort of like this but weirder; and it's
different-but-related wrt this thread:

there are 6 registers, a PC, a FP, and a 3-or-4 deep hardware stack.
all local variables are in "memory", and they have fast FP+short offset
addressing mode. current chips have (1, soon 8, kWord) on-chip 50ns memory
where you want to put your stack(s?). architecturally i think the chip
could be done with NO on-chip (i.e. fixed address) ram but cache instead.
in fact you can disable the on-chip bank if you want to, although i'm not
sure there're the perfect set of signals coming out that cache users
would want.

context switches will only happen on certain instructions (like loop
bottom, and in/out, which might block). they do have a "highest register
currently in use" sort of, but they assume that the compiler will make
sure that NONE of them are in use at those points; it only saves the PC
and FP.

#define HIGHEST_USED_REGISTER 0 /* HA. so THERE. deal with THAT, mate! */

remember the TI 99*?
before my time, but it's a memory-to-memory machine, with the "registers"
being memory-mapped, FP relative. i heard that it was a dog, at least
the micro version.

has anybody considered memory-to-memory architectures in the face of
modern cache design? it seems to solve some problems (but probably
introduce some, i just can't see what right off the bat). somebody
recently was wondering about if you do global register allocation, what
if your most frequent variable had pointers to it, how do you handle
that? this'll solve that at least, right?

in america the transputer gets little attention (i guess cause it's got
no mmu and it's not intel- or moto-compatible and cause inmos took about
5 years before they got anybody to understand what the heck it was).
but it is weird, and in lots of ways good, and we can learn from it.
it's risc not in the sense of "register-to-register with windows" but
in the sense of "optimized in the face of real data", or at least they
do constantly quote numbers about how "only 1.5% of these actually
require those so we'll make you do it in software" and they were one of
the first to make short immediate constants load fast.

have load-store day.

Les Milash
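
A rough C sketch of the "only save the PC and FP" idea described above. This is a toy model under loud assumptions, not the real Transputer instruction set: the names Cpu, Context, context_switch and loop_body are all made up for illustration, and the "PC" is just a step counter. Locals live in memory at FP-relative offsets, the small evaluation stack is drained before every switch point, and the switch itself copies two words.

```c
#include <assert.h>

#define HIGHEST_USED_REGISTER 0   /* as in the post: nothing live at a switch point */

/* Everything saved per process: where to resume and where its locals live. */
typedef struct {
    int  pc;   /* toy "program counter": index of the next step to run */
    int *fp;   /* frame pointer into that process's workspace memory */
} Context;

/* Toy CPU (hypothetical): a 3-deep evaluation stack plus PC and FP. */
typedef struct {
    int  stack[3];
    int  depth;   /* number of live evaluation-stack entries */
    int  pc;
    int *fp;
} Cpu;

/* Switch only at a "safe" instruction; the stack must already be empty. */
static void context_switch(Cpu *cpu, Context *save_to, const Context *load_from)
{
    assert(cpu->depth == HIGHEST_USED_REGISTER);
    save_to->pc = cpu->pc;      /* save two words ... */
    save_to->fp = cpu->fp;
    cpu->pc = load_from->pc;    /* ... and load two words */
    cpu->fp = load_from->fp;
}

/* One loop iteration: local[0] += local[1], via FP + short offset. */
static void loop_body(Cpu *cpu)
{
    cpu->stack[cpu->depth++] = cpu->fp[0];       /* push local 0 */
    cpu->stack[cpu->depth++] = cpu->fp[1];       /* push local 1 */
    cpu->depth -= 2;                             /* pop both operands */
    cpu->fp[0] = cpu->stack[0] + cpu->stack[1];  /* add, store to local 0 */
    cpu->pc++;                                   /* loop bottom: stack is empty */
}
```

To try it, give two processes their own int arrays as workspaces, point a Cpu at the first, and alternate loop_body and context_switch calls. Each process's locals survive the switch because they were never in registers to begin with, which is also why a pointer to a frequently used local is no problem in this scheme.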