Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uwm.edu!ogicse!emory!hubcap!mark
From: mark@hubcap.clemson.edu (Mark Smotherman)
Newsgroups: comp.arch
Subject: Re: RS/6000 cache
Message-ID: <12962@hubcap.clemson.edu>
Date: 6 Feb 91 21:51:20 GMT
Organization: Clemson University, Clemson, SC
Lines: 58

Thanks to those who emailed responses.

I did not understand that the RS/6000 cache is a *hybrid* organization.
That is, the set selection is done using bits from the virtual address,
while memory transfers are done using bits from the physical address.
The key concept that I was missing was that the tag is the full PFN --
and not merely the high PA (physical address) bits beyond the index and
line offset (as in figure 8.8 on p. 411 in Hennessy and Patterson).

Most overlapped lookup schemes for physically addressed caches only use
the 12-bit page offset for the index and line offset fields (since these
12 bits are identical in both the VA and the PA).  (See H&P p. 438.)

       20 (or 30) bit VPN   12 bit offset
VA    [____________________|____________]
      >>>>>>>>>map<<<<<<<<< VVVVVVVVVVVV
PA    [____________________|____________]
       20 bit PFN           12 bit offset
     /                    / subdivide  /
    / PFN becomes tag    / into index /
   /                    / and line   /
  /                    / offset     /

However, the current models of the RS/6000 send the low 14 bits of the
VA to the cache for lookup (the architecture reserves the right to send
up to 20).  Thus, the high two bits of the index are from the VA and may
differ from the bottom two bits of the PFN.

                        14 bits to the cache
                         (++ +++++ +++++++)
VA    [__________________.__|_____._______]    128 bytes/line => 7 bit offset
                         / 7 bit /7 bit  /     14-7=7 bit index => 128 sets
                        / line  /offset /      4 way set assoc => 4 lines/set
                       / index /in line/

      128 sets * 4 lines/set * 128 bytes/line = 64K bytes

Yet you keep the full 20 bit PFN as the tag (and not just the high 18
bits).
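
As a rough check on the arithmetic above, here is a minimal C sketch of
how the 14 bits sent to the cache split into set index and line offset,
and how the tag is taken from the PA.  The function and constant names
are invented for illustration (they are not from any IBM documentation),
and 32-bit addresses with 4K pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry as stated in the post; names are hypothetical. */
enum {
    LINE_BITS  = 7,   /* 128 bytes/line  => 7 bit line offset */
    INDEX_BITS = 7,   /* 14 - 7 = 7 bits => 128 sets          */
    WAYS       = 4,   /* 4 way set associative                */
    PAGE_BITS  = 12   /* assumed 4K pages => 12 bit offset    */
};

/* Byte offset within the line: low 7 bits of the VA. */
uint32_t line_offset(uint32_t va)
{
    return va & ((1u << LINE_BITS) - 1u);
}

/* Set index: VA bits 7..13.  The top two of these lie above the
   page offset, so they come from the VPN, not from the PFN. */
uint32_t set_index(uint32_t va)
{
    return (va >> LINE_BITS) & ((1u << INDEX_BITS) - 1u);
}

/* The two index bits taken from the VPN (VA bits 12..13). */
uint32_t index_bits_from_vpn(uint32_t va)
{
    return (va >> PAGE_BITS) & 3u;
}

/* Tag: the full 20 bit PFN, i.e. everything above the page offset. */
uint32_t tag_of(uint32_t pa)
{
    return pa >> PAGE_BITS;
}

/* Total capacity: sets * lines/set * bytes/line. */
uint32_t cache_bytes(void)
{
    return (1u << INDEX_BITS) * WAYS * (1u << LINE_BITS);
}
```

For instance, set_index(0x3F80) selects set 127, and cache_bytes()
returns 65536, which matches the 64K figure.  Comparing
index_bits_from_vpn(va) with tag_of(pa) & 3 for a given mapping shows
whether the two virtual index bits happen to agree with the low two PFN
bits; nothing forces them to.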
PA    [__________________.__|_____._______]
      /         tag         /

Thus, on a cache miss you keep the 7 bit index for knowing which set
will reload the line (which is obtained from memory using the PA).  And
on write back you generate the line PA by taking the 20 bit tag and
appending the low 5 bits of the index (ignoring the top two bits of the
index, which may differ).

Somehow I didn't see that the tag overlaps the index.  Very interesting
approach to increasing cache size while retaining the advantages of a
physically-addressed cache with overlapped lookup.

--
Mark Smotherman, Comp. Sci. Dept., Clemson University, Clemson, SC 29634
INTERNET: mark@hubcap.clemson.edu    UUCP: gatech!hubcap!mark
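
The write-back address construction described above can be sketched in C
as follows.  This is only an illustration under the post's stated
geometry (20 bit tag, 7 bit index, 7 bit line offset); the function name
is hypothetical and the real hardware datapath may well differ.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild the physical address of a line being written back.
   tag_pfn : the 20 bit tag stored with the line (the full PFN)
   index   : the 7 bit set index the line lives in
   Only the low 5 index bits fall inside the 12 bit page offset;
   the top two came from the VA and are discarded here. */
uint32_t writeback_line_pa(uint32_t tag_pfn, uint32_t index)
{
    uint32_t in_page_index = index & 0x1Fu;       /* low 5 bits of the index */
    return (tag_pfn << 12) | (in_page_index << 7); /* line offset bits are 0 */
}
```

With tag 0x12345 and index 0x7F this yields 0x12345F80: the top two
index bits do not appear in the result, since the PFN already supplies
PA bits 12 and 13.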