Path: utzoo!attcan!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!rutgers!mcnc!rti!dg-rtp!siberia!hamilton
From: hamilton@siberia.rtp.dg.com (Eric Hamilton)
Newsgroups: comp.arch
Subject: Re: [m88200] cache flushes [on DG Aviion]
Message-ID: <1990Dec18.150615.26762@dg-rtp.dg.com>
Date: 18 Dec 90 15:06:15 GMT
References:
Sender: usenet@dg-rtp.dg.com (Usenet Administration)
Reply-To: hamilton@siberia.rtp.dg.com (Eric Hamilton)
Organization: Data General Corporation, Research Triangle Park, NC
Lines: 52

In article , rouellet@crhc.uiuc.edu (Roland G. Ouellette) writes:
|> > The reason for requiring OS assistance/interference (pick one, according
|> > to your prejudices) in user cache operations is multi-processor systems.
|> > Direct hardware support for cache invalidate and writeback commands in
|> > an MP system is possible, but constrains and complicates the hardware
|> > design immensely.  Think about how hardware might implement cache
|> > writeback/invalidate operations in an MP system....
|>
|> Actually it's not all that bad.  REI on VAX (besides doing a ton of
|> other stuff) flushes the icache.  A separate write-back instruction
|> cache makes hardly any sense at all (especially when you talk about
|> using the OS to flush things back to memory... the instruction stream
|> seldom changes... and the architecture can require the user,
|> compiler, whatever, to do something when it does).  The icache might
|> take invalidates from other processors, but require a cache flush
|> instruction upon dynamically generating code.  The hardware support is
|> about six transistors per cache line in the instruction cache which is
|> used to clear the valid bit on the line.
|>

Some context was lost when this discussion outgrew comp.sys.m88k....
The question there was whether it makes sense to supply user-level
non-privileged instructions that will copyback (a range of) the data
cache and invalidate (a range of) the instruction cache.
These operations are important to the folks doing incremental
compilation, dynamic linking, garbage collection, planting
breakpoints/watchpoints, and the like, especially in a multi-threaded
and multi-processor environment.

When the code stream changes, it's necessary to cause all data caches
in an MP system to write back, and then to invalidate all instruction
caches (in a Harvard architecture, instruction caches that don't snoop
are a reasonable implementation choice); only after the data caches
have completed their writebacks is it safe to allow any processor to
start refilling its instruction cache.  This is not easy to do in
hardware, or at least not so easy that it should be done there without
software involvement.

Life is somewhat easier on VAX and similar proprietary CISC
architectures, because there is much flexibility to move the
implementation from hardware to microcode to an OS trap handler while
preserving the user-level illusion of direct hardware support for the
desired functionality.  But even there, I would expect that many
implementations would implement what appear to be user-level cache
control operations by trapping to kernel software or microcode, which
is not exactly direct hardware support.  Surely the VAX REI instruction
doesn't flush all instruction caches in a multi-processor?

|>
|> In an MP system, having a
|> large coherent shared backup will make the cache refill penalty
|> reasonably small.

Absolutely right.
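
To make the ordering constraint concrete (write back every data cache
first, and only then invalidate every instruction cache), here is a toy
software model in C.  It is a sketch under loud assumptions, not a
description of the m88200 or of any real OS: the names NCPU, struct cpu,
store_code, ifetch and sync_code are all hypothetical, each cache holds
a single line, and "memory" is a single word of code.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU 4                  /* hypothetical processor count */

/* One word of "code" in shared memory; 1 stands for the old instruction. */
static int memory_word = 1;

/* Toy per-CPU caches: one write-back data line, one non-snooping I-line. */
struct cpu {
    bool d_dirty;               /* data line holds a value not yet in memory */
    int  d_data;
    bool i_valid;               /* instruction line may be used without refill */
    int  i_data;
};

static struct cpu cpus[NCPU];   /* zero-initialized: clean and invalid */

/* A CPU writes a new instruction; it lands only in that CPU's data cache. */
static void store_code(struct cpu *c, int insn)
{
    c->d_data = insn;
    c->d_dirty = true;
}

/* Instruction fetch: refill from memory only if the I-line is invalid. */
static int ifetch(struct cpu *c)
{
    if (!c->i_valid) {
        c->i_data = memory_word;
        c->i_valid = true;
    }
    return c->i_data;
}

/* What the OS (or microcode) would have to do when the code stream changes. */
static void sync_code(void)
{
    int i;

    /* Phase 1: every data cache writes its dirty line back to memory. */
    for (i = 0; i < NCPU; i++) {
        if (cpus[i].d_dirty) {
            memory_word = cpus[i].d_data;
            cpus[i].d_dirty = false;
        }
    }

    /* Phase 2: only after all writebacks are done, invalidate every
     * instruction cache, so that any refill sees the new code. */
    for (i = 0; i < NCPU; i++) {
        assert(!cpus[i].d_dirty);   /* ordering invariant of this model */
        cpus[i].i_valid = false;
    }
}
```

In this model, if CPU 0 calls store_code and CPU 1 then calls ifetch
without an intervening sync_code, CPU 1 keeps executing the stale
instruction out of its own instruction cache; after sync_code, every
CPU's next ifetch refills from memory and sees the new one.  The model
is sequential, so it says nothing about how the two phases are kept
from overlapping across real processors, which is the hard part.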