Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!mips!sgi!shinobu!odin!sgihub!dragon!putter.wpd.sgi.com!bean
From: bean@putter.wpd.sgi.com (David (Bean) Anderson)
Newsgroups: comp.arch
Subject: Re: registerless architecture
Message-ID: <1990Nov13.035859.4777@relay.wpd.sgi.com>
Date: 13 Nov 90 03:58:59 GMT
References: <1990Nov12.145410.29035@cs.cmu.edu>
Sender: news@relay.wpd.sgi.com ( CNews Account )
Reply-To: bean@putter.wpd.sgi.com (David (Bean) Anderson)
Organization: Silicon Graphics Inc.
Lines: 58

In article <1990Nov12.145410.29035@cs.cmu.edu>, spot@WOOZLE.GRAPHICS.CS.CMU.EDU (Scott Draves) writes:
|>
|> Has anyone every thought about or done a registerless architecture?
|> registers, after all, are just a sort of cache, another level in the
|> memory hierarchy. but a fixed size, hard-wired one. Consider
|> a machine with a 4 level memory
|>
|>   0) the fpu and alu         0Kb
|>   1) on-chip cache          10Kb
|>   2) normal cache          100Kb
|>   3) main ram           10 000Kb
|>   4) magnetic disk     100 000Kb
|>
|> It is very easy expand the size/speed of caches, but not to add registers.
|> I think this is a big advantage. The way a cache works generalizes
|> the behavior things like register windows.
|>
|> One problem is that instructions would have to be very large (3 addresses).
|> using a stack based approach would help. The 3 addresses are then
|> relative to the stack pointer, and can be small enough to fit into the
|> instruction. That's 8 or 9 bits for 32 bit machines, or twice that
|> for 64 bit machines. again, it scales easily.
|>
|> context switch is fast and easy, there's nothing but CCR, PC, and FP.
|>
|> any thoughts on this? stupid idea, or the wave of the future? :)

1. Register files are typically multi-ported -- one can usually get
   two reads and one write to the file in one clock (indeed, usually
   in a small fraction of the clock) -- whereas a cache is typically
   single-ported, and while it can deliver one data item per clock,
   that item usually arrives on the "next" clock. Caches will always
   be slower than registers because (if for no other reason) the path
   length and gate count to the cache will be higher than to a
   register file.

2. Why are registers considered a *problem*? Modern compilers usually
   do a good job of using the registers effectively, as opposed to
   *stupid* cache hardware. Indeed, some interesting work on "blocking
   algorithms" (faking the cache into behaving like a large register
   file) has resulted in some impressive performance figures.

3. The HP3000 is a stack machine with no GPRs. The hardware (on some
   models) would keep the top four stack items in a register file in
   order to increase performance.

4. Register window architectures are an interesting compromise. They
   use a large register file that the compiler can use as it sees fit;
   however, one can address registers either by name or relative to
   the window base.

Who decides what data items should go in high speed memory is the
critical issue: hardware implemented heuristics (cache) or compiler
handled directives (registers)? There are places for both.

Bean
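
For readers who have not seen the "blocking algorithms" mentioned in point 2, here is a minimal sketch in C, not taken from any particular paper or compiler. It assumes square row-major matrices; the names matmul_naive and matmul_blocked and the tile size parameter bs are made up for illustration. The blocked version does the same arithmetic as the naive one, but works on bs x bs tiles so that each tile is reused while it is still in the cache, and the scalar aik is the kind of value a compiler would keep in a register.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: C += A * B for n x n row-major matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = c[i * n + j];
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/*
 * Blocked (tiled) version: same result, but the loops are split so the
 * inner three loops touch only one bs x bs tile of each matrix at a time.
 * bs is a hypothetical tuning parameter; a real choice depends on the
 * cache size of the machine.
 */
void matmul_blocked(size_t n, size_t bs,
                    const double *a, const double *b, double *c)
{
    assert(bs > 0);
    for (size_t ii = 0; ii < n; ii += bs)
        for (size_t kk = 0; kk < n; kk += bs)
            for (size_t jj = 0; jj < n; jj += bs) {
                size_t imax = min_sz(ii + bs, n);   /* clip edge tiles */
                size_t kmax = min_sz(kk + bs, n);
                size_t jmax = min_sz(jj + bs, n);
                for (size_t i = ii; i < imax; i++)
                    for (size_t k = kk; k < kmax; k++) {
                        double aik = a[i * n + k];  /* register candidate */
                        for (size_t j = jj; j < jmax; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

Both routines accumulate into c, so the caller zeroes c first to get a plain product. Whether blocking pays off, and at what tile size, depends on the cache in question; the sketch only shows the loop restructuring itself.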