Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!portal!cup.portal.com!mmm
From: mmm@cup.portal.com (Mark Robert Thorson)
Newsgroups: comp.arch
Subject: Re: flexible caches
Message-ID: <22268@cup.portal.com>
Date: 18 Sep 89 20:43:51 GMT
References: <224@qusunr.queensu.CA> <22151@cup.portal.com> <22211@cup.portal.com> <2125@munnari.oz.au>
Organization: The Portal System (TM)
Lines: 48

ok@cs.mu.oz.au (Richard O'Keefe) says:

> Fact 3: the kind of information Thorson envisaged his hypothetical nets
> learning is very complex: this variable ought to be cached NOW
> even though it shouldn't have been cached a few seconds ago.

His whole argument hinges on this point.  I suppose I must agree that at
the byte level this is true.  The storage of the synaptic weights and the
current level of stimulation will occupy more silicon area than the data
itself, so your money is better spent on more cache.

But for page-level cache control, I think the argument is still valid.
Here's a crude sketch of what I have in mind:

1)  Equip the CPU with a mechanism for measuring its own performance.
This might require the compiler to insert code which gives the program a
"heartbeat".  Also provide an indication of which mode the program is in;
the compiler could insert code to load an immediate into a register
whenever a call to one of the major functional blocks of the program
occurs, so the current mode is always reported in a register.  For each
mode, the performance of the program during the previous run would be
available; this too could be a register loaded during the call.  Current
performance would be available in a register maintained by hardware.

2)  Extend the page table entries to hold synaptic weights which
associate the caching effectiveness of a page with the mode the program
is in.  For 16 modes, 16 bytes of synaptic weights would be enough.  (I
could argue that 16 nibbles would be enough.)

3)  When current performance is better than previous performance, uptick
the synaptic values for all cached pages.  When current performance is
worse, downtick them.

4)  Equip the CPU with enough neurons for the pages in the working set.
The neuron model should support both spatial and temporal summation.
When a reference is made to an uncached page, it replaces a cached page
if there is a cached page whose neuron output is below a threshold value.
It may be necessary to inject a little random noise into the neurons to
make sure every page gets a chance to improve execution performance.

Pages that contribute to performance will eventually acquire high
synaptic values for the modes in which they help, and so will be
difficult to push out of the cache.  Pages which compete with the winners
will eventually get low (or even negative) synaptic values, and so will
be vulnerable to being pushed out.
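
To make steps 1) through 4) a little more concrete, here is a toy
simulation of the replacement policy in C.  All of the names, the weight
range, and the noise amplitude are placeholders I made up for
illustration; the real mechanism would live in the page tables and the
cache controller, not in software, and a real neuron would do spatial and
temporal summation rather than just adding noise to one weight.

/* Toy sketch of the page-replacement scheme above.  Hypothetical only. */
#include <stdio.h>
#include <stdlib.h>

#define N_MODES   16    /* program modes reported by compiler-inserted code */
#define N_CACHED  8     /* pages resident in the cache (working set)        */
#define THRESHOLD 0     /* neuron output below this makes a page evictable  */

struct page {
    int id;
    signed char weight[N_MODES];    /* one synaptic weight per mode (step 2) */
};

static struct page cache[N_CACHED];

/* Step 3: reward or punish every cached page after each "heartbeat". */
static void update_weights(int mode, double current_perf, double previous_perf)
{
    int delta = (current_perf > previous_perf) ? +1 : -1;
    for (int i = 0; i < N_CACHED; i++)
        cache[i].weight[mode] += delta;
}

/* Step 4 (simplified): neuron output = weight for the current mode plus
 * a little random noise, so every page occasionally gets a chance.      */
static int neuron_output(const struct page *p, int mode)
{
    int noise = (rand() % 3) - 1;   /* -1, 0, or +1 */
    return p->weight[mode] + noise;
}

/* On a miss, replace a cached page only if some page's neuron output
 * falls below the threshold; otherwise the new page is not cached.   */
static void reference_uncached(int new_id, int mode)
{
    int victim = -1, lowest = THRESHOLD;
    for (int i = 0; i < N_CACHED; i++) {
        int out = neuron_output(&cache[i], mode);
        if (out < lowest) { lowest = out; victim = i; }
    }
    if (victim >= 0) {
        cache[victim].id = new_id;
        for (int m = 0; m < N_MODES; m++)
            cache[victim].weight[m] = 0;    /* fresh page starts neutral */
    }
}

int main(void)
{
    for (int i = 0; i < N_CACHED; i++)
        cache[i].id = i;

    /* Pretend mode 3 just ran slower than last time: every resident page
     * gets downticked, making them candidates for replacement.          */
    update_weights(3, 0.8, 1.0);
    reference_uncached(99, 3);

    for (int i = 0; i < N_CACHED; i++)
        printf("page %d  weight[3] = %d\n", cache[i].id, cache[i].weight[3]);
    return 0;
}

Run it and one of the downticked pages will usually have been pushed out
in favor of page 99; flip the performance numbers the other way and the
resident pages hold their ground, which is the behavior I'm after.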