Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site ucbcad.UUCP
Path: utzoo!linus!decvax!tektronix!ucbcad!ucbesvax.turner
From: ucbesvax.turner@ucbcad.UUCP
Newsgroups: net.arch
Subject: uP caches, cont'd. - (nf)
Message-ID: <1041@ucbcad.UUCP>
Date: Thu, 15-Dec-83 01:17:17 EST
Article-I.D.: ucbcad.1041
Posted: Thu Dec 15 01:17:17 1983
Date-Received: Sun, 11-Dec-83 01:07:13 EST
Sender: notes@ucbcad.UUCP
Organization: UC Berkeley CAD Group
Lines: 51

#N:ucbesvax:27900003:000:2645
ucbesvax!turner    Dec  8 12:23:00 1983

I don't like the idea of putting registers in an on-board cache memory
(and then translating register references to full memory addresses).
Some reasons why:

    - it increases the amount of control logic required to interpret
      a register reference.  One must not only extract the reference,
      but add it to a full-address-space pointer, and hand it through
      the cache-address translator.  As we will see below, this might
      involve serializing register access--involving yet more control
      logic.

    - one advantage of a true register file is that one can use
      dual-ported memory to gain speed by allowing overlapped fetches.
      Making a whole cache (~256..~4K bytes) out of dual-ported memory
      would be rather expensive.  The only other way to achieve
      overlapped fetching in the cache would be to interleave the
      cache RAM--and that's only a statistical speed-up.  There will
      still be cases where register access must be serialized *unless*
      the interleave factor is equal to the number of registers.

This seems like a high cost to pay just to get register-to-register
operations that are (nearly) as fast as they are in processors that
don't map registers to memory.  One does NOT contort the design of a
cache around the architecture!
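To put rough numbers on the interleaving point, here is a minimal sketch
under assumptions that are not in the post: a hypothetical machine with
16 registers mapped to consecutive cache words, and a cache interleaved
on the low-order bits of the word address. It counts how many pairs of
distinct registers fall in the same bank, i.e. the operand pairs whose
fetches could not be overlapped. The names `bank_of` and
`conflicting_pairs` and the register count are made up for illustration.

```python
from itertools import combinations


def bank_of(reg, banks):
    """Bank holding register `reg`, assuming low-order interleaving."""
    return reg % banks


def conflicting_pairs(num_regs, banks):
    """Count pairs of distinct registers that map to the same bank.

    A pair in the same bank cannot be fetched in one overlapped access,
    so it stands for a serialized two-operand fetch.  Using the same
    register twice needs only one fetch and is not counted.
    """
    return sum(
        1
        for a, b in combinations(range(num_regs), 2)
        if bank_of(a, banks) == bank_of(b, banks)
    )


if __name__ == "__main__":
    NUM_REGS = 16  # assumed register count, purely illustrative
    total = NUM_REGS * (NUM_REGS - 1) // 2
    for banks in (1, 2, 4, 8, 16):
        hits = conflicting_pairs(NUM_REGS, banks)
        print(f"{banks:2d} banks: {hits:3d} of {total} pairs serialize")
```

Under these assumptions the serialized pairs go 120, 56, 24, 8, 0 as the
bank count doubles from 1 to 16: more banks help, but the conflicts only
disappear when there are as many banks as registers.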
In fact, I am in favor of quite the opposite, for the special case of
single-chip microprocessors: violate the rule of transparency to the
extent of adding instructions that address issues of control and
optimization of caches, then contort the compiler (somewhat) around
these instructions.

    - runaway pointers can trash your whole context, making it very
      hard to debug programs with that problem.  Sure you could trap
      such accesses if they were inappropriate.  But again, that means
      clapping on some special frob to test for indirect addressing of
      register-mapped memory.  With a special supervisor control bit,
      perhaps, so that one *can* do it when one wants to.  And a
      partridge in a pear tree.  It all adds up.

Assuming that this discussion is concerned ONLY with the kind of cache
one puts on a single-chip microprocessor, I think people should realize
that you don't just say "oh, and let's add this".  On a chip, everything
steals something from everything else.  (In a TTL design, maybe you just
have to beef up the power supply a little to add new features.
Eventually you run out of board space.  Bare silicon is a rather
different medium.)

Don't let VLSI and its small packages fool you.  To take full advantage
of a million transistors on one die is going to be at least as hard as
designing Cray machines.
---
Michael Turner (ucbvax!ucbesvax.turner)