Path: utzoo!attcan!uunet!cs.utexas.edu!usc!julius.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!aglew
From: aglew@crhc.uiuc.edu (Andy Glew)
Newsgroups: comp.arch
Subject: Re: Cache Line Fills -- Critical Word First
Message-ID:
Date: 3 Oct 90 17:43:32 GMT
References: <34275@cup.portal.com> <14780@cbmvax.commodore.com> <41856@mips.mips.COM> <1990Oct3.140725.3931@mozart.amd.com>
Sender: news@ux1.cso.uiuc.edu (News)
Organization: Center for Reliable and High-Performance Computing
	University of Illinois at Urbana Champaign
Lines: 46
In-Reply-To: tim@proton.amd.com's message of 3 Oct 90 14:07:25 GMT

..> Critical word first,
..> and starting the processors as soon as the missing instruction is fetched.

>In article <41856@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>| For the I-cache, when you have a cache miss:
>|   I1) You can stall the machine, fetch the entire cache block,
>|       then restart.  This is clearly the simplest.
>|   I2) You can do "early restart", where you begin executing as soon
>|       as the requested word is available.  This is sometimes called
>|       "Instruction Streaming" (in the MIPS R3000), i.e., when you
>|       cache miss:
>|           start fetching at word 0 of the block
>|           stall until the needed word is fetched, then stream
>|           if you branch elsewhere before the end of the block,
>|               stop streaming, stall the pipeline until block filled
>|           also, if a load/store causes a cache miss, complete
>|               the I-cache refill, then handle the D-cache miss
>|   I3) You can do "out-of-order fetch" in addition to early restart,
>|       and then do "wrapped fetch", so that you wrap-around to complete.
>
>[tim@proton.amd.com (Tim Olson)]:
>There are also other possibilities, such as:
>
>    I4) Have a valid bit per word in the cache block, and fetch
>        the missed instruction first, then burst reload continuing
>        from that instruction into subsequent blocks, rather than
>        wrapping around to complete the missed block.
>
>This tends to match instruction fetch patterns better than the other
>solutions, but with the added expense of extra valid bits and more
>complexity.

One of the advantages of the IBM RS/6000's Cache Reload Buffers seems
to be that they function as an extra cache line that has a valid bit
for every bus-width unit of data.  When the burst is finished, the CRB
is transferred to a regular cache line that has fewer valid bits.[*]

[*] Actually, I'm not sure whether the RS/6000 has fewer valid bits in
the normal cache line - this might be one of my extrapolations.

--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]
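
For readers who want the fetch orders of I1 through I4 spelled out, here is a
minimal Python sketch. It is an illustration only: the function names, the
scheme labels passed as strings, and the example of a 4-word block with a miss
at word 2 are assumptions made for this sketch, not taken from any real
machine's documentation. It models only the order in which words arrive and
how many words must arrive before execution can resume; it ignores bus timing,
branches during the fill, and D-cache misses.

```python
def sequential_order(block_words, miss_word):
    """I1 and I2: the fill always starts at word 0 of the block."""
    return list(range(block_words))


def wrapped_order(block_words, miss_word):
    """I3: fetch the missed word first, then wrap around within the block."""
    return [(miss_word + i) % block_words for i in range(block_words)]


def forward_burst_order(block_words, miss_word, burst_words):
    """I4: fetch the missed word first and keep going into following blocks.

    Returns (block_offset, word) pairs, where block_offset 0 is the block
    that missed. Words below miss_word in block 0 are never fetched, which
    is why this scheme needs a valid bit per word.
    """
    return [divmod(miss_word + i, block_words) for i in range(burst_words)]


def words_before_restart(scheme, block_words, miss_word):
    """Number of words that must arrive before the processor can resume."""
    if scheme == "I1":
        return block_words      # wait for the whole block
    if scheme == "I2":
        return miss_word + 1    # fill from word 0, restart at the missed word
    if scheme in ("I3", "I4"):
        return 1                # the missed word arrives first
    raise ValueError("unknown scheme: " + scheme)


if __name__ == "__main__":
    n, miss = 4, 2  # hypothetical 4-word block, miss on word 2
    print("I1/I2 order:", sequential_order(n, miss))
    print("I3 order:   ", wrapped_order(n, miss))
    print("I4 order:   ", forward_burst_order(n, miss, n))
    for scheme in ("I1", "I2", "I3", "I4"):
        print(scheme, "restarts after", words_before_restart(scheme, n, miss), "word(s)")
```

In this model I3 and I4 restart equally early; they differ in what the rest of
the burst brings in. I3 completes the missed block, so one valid bit per block
is enough, while I4 leaves the low words of the missed block unfilled and
spends the burst on the sequentially following words instead.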