Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!rice!titan!retrac
From: retrac@titan.rice.edu (John Carter)
Newsgroups: comp.arch
Subject: Re: How Caches Work
Message-ID: <1174@brazos.Rice.edu>
Date: 13 Sep 89 01:57:42 GMT
References: <21936@cup.portal.com> <1082@cernvax.UUCP> <16306@watdragon.waterloo.edu> <8399@boring.cwi.nl> <3989@phri.UUCP>
Sender: root@rice.edu
Reply-To: retrac@titan.rice.edu (John Carter)
Organization: Rice University, Houston
Lines: 68
Keywords: adaptive caching, Munin

In article <3989@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>	Here's a (possibly crazy) idea for cache design.  [Stuff deleted.]
>
>	What if you segmented the virtual memory space (Oh no!  Not
>segmented address spaces again!  Shades of Intel!) so that the top bit was
>a hint to the cache on probable access patterns.  Variables which were
>expected to hit the cache a lot (SUM and I in the EUD) would be put in the
>"normal" part of the address space.  Variables which were expected to be
>sequential access and thus never hit (VEC in the EUD) would be put in the
>other half of the address space.  The cache would know not to bother doing
>a tag match on this kind of access.  The advantages would be faster access
>time (a memory fetch should be faster than a cache miss followed by a
>memory fetch), but more importantly it wouldn't cause bogus cache flushes.
>
>	As with every crazy idea, you can expand on this in all sorts of
>ways.  You might use more than one high order bit to provide lots of
>different sorts of hints to the cache.  You might want to be able to turn
>the hinting on and off, or even make it programmable.  Of course, every
>bell and whistle adds cost and complexity, and reduces the chance that you
>will use the fancy features well, or at all.
>
>	So what do you think?  Has this been done before?

I don't know of anybody else doing exactly this, but I/we have been
examining something similar here at Rice.
Our motivation for examining what I call adaptive caching schemes (adaptive
because the caching mechanism adapts to the way the shared data is being
accessed) is the design of a `highly efficient' distributed shared memory
system (I put highly efficient in quotes because we're just starting to
move out from paper design, and while efficiency is a major goal, I don't
know how well we'll succeed until we actually get some hard numbers).

Implementing distributed shared memory is quite similar to implementing a
multiprocessor cache (ref. work by Kai Li among others), but has a few
important differences, including much longer latency to `main memory' and
lower bus bandwidth (which may or may not support efficient broadcast,
depending on your target distributed memory multiprocessor).  The high
cost of `page faults' in such a system motivated us to study ways to
significantly reduce the number of `page faults' by taking advantage of
semantic information either derived automatically by the compiler or
specified by the user.  Since distributed shared memory systems are
essentially software-implemented caches, you can implement a more
complicated caching mechanism than can reasonably be implemented in
hardware.

I don't have the time right now to go into details (both Sigmetrics and
PPoPP have immediate deadlines that we're shooting for), but our basic
idea is that you can characterize different common `types' of shared
memory objects (data items), provide multiple cache coherence mechanisms
designed to efficiently support these different `types', and have each
shared object be handled by a caching mechanism tuned to its needs.

The current thrust of the work has been two-fold: characterizing the way
shared memory is accessed (somewhat related to earlier work by Weber &
Gupta, Eggers & Katz, and Agarwal et al.)
and developing the basic design of a distributed shared memory system
that can take advantage of this knowledge (Munin -- if you want to know
the etymology of the name, you'll have to ask :-).

If anybody is interested, we should have a few tech reports available
soon (next week).  One will describe our work on characterizing shared
memory access patterns and the other will discuss the preliminary design
of Munin.

Hope you found this interesting.

John Carter                   Internet: retrac@rice.edu
Dept of Computer Science      UUCP: {internet node or backbone}!rice!retrac
Rice University
Houston, TX

"Badgers?!  We don't need no stinking badgers!"  - from UHF