Path: utzoo!attcan!uunet!ncrlnk!ncr-sd!hp-sdd!hplabs!amdcad!crackle!tim
From: tim@crackle.amd.com (Tim Olson)
Newsgroups: comp.arch
Subject: Re: Register Windows (was Re: Japanese...)
Message-ID: <23155@amdcad.AMD.COM>
Date: 7 Oct 88 16:46:48 GMT
References: <58@zeno.MN.ORG> <91@zeno.MN.ORG> <9410@pur-ee.UUCP>
Sender: news@amdcad.AMD.COM
Reply-To: tim@crackle.amd.com (Tim Olson)
Organization: Advanced Micro Devices, Inc. Sunnyvale CA
Lines: 46

In article <9410@pur-ee.UUCP> hankd@pur-ee.UUCP (Hank Dietz) writes:
| Let's say you have 8 non-lazy windows (and one heck of a lot of valuable die
| space consumed by them). What do you do when the 9th nested call is made?
| The 10th? You do a sit-and-wait-for-it burst store, that's what... would
| you really describe that as being "*without any* memory references at all!"

True, but then the non-lazy scheme loads or stores only what is
absolutely required, when it is required. Lazy operations are
continually trying to load/store ahead. This seems like the real
misnomer -- non-lazy windows are truly lazy (only doing what is
required) vs. "lazy windows" (which are quite active).

Consider an "on-demand" load/store window scheme with 4 windows, vs. a
"background" load/store window scheme with 2 windows (which was implied
to be all that was required). Suppose the call chain looks like

	1 2 3 4 5 6 7 6 7 6 7 6 7 6 7 8 7 6 5 6

i.e. it spends a lot of time around a local maximum depth (certainly
not atypical). The "on-demand" window scheme has no saving or restoring
to do while bouncing around between levels 7 and 6 because of the
built-in hysteresis. The "background" load/store scheme, however, is
continually saving and restoring. This is a waste of memory bandwidth.

What register windows are buying is this hysteresis in saving and
restoring the stack frame, and the only way to get it is to provide a
large number of windows.
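
To make the hysteresis argument concrete, here is a rough Python sketch
that replays the call chain above under two toy models. Both models are
assumptions made for illustration, not a description of any real part:
the "on-demand" model spills the oldest frame only when a call finds the
register file full and fills only when a return lands on a frame that is
no longer resident; the "background" model is one possible reading of a
2-window scheme, in which every call writes the caller's frame out
behind it and every return preloads the frame below the new current one,
with no dirty tracking. The function names are hypothetical.

```python
# Call chain from the post: each entry is the call depth after one call/return.
CHAIN = [1, 2, 3, 4, 5, 6, 7, 6, 7, 6, 7, 6, 7, 6, 7, 8, 7, 6, 5, 6]


def on_demand_traffic(chain, windows):
    """Count (spills, fills) for a toy on-demand scheme with `windows` windows."""
    depth = 0        # current call depth; 0 means no frame is active yet
    oldest = 1       # shallowest frame still held in the register file
    spills = fills = 0
    for target in chain:
        if target == depth + 1:            # call
            depth = target
            if depth - oldest + 1 > windows:
                spills += 1                # file is full: write out oldest frame
                oldest += 1
        elif target == depth - 1:          # return
            depth = target
            if 0 < depth < oldest:
                fills += 1                 # caller's frame was spilled: reload it
                oldest = depth
        else:
            raise ValueError("chain must move one level at a time")
    return spills, fills


def background_traffic(chain):
    """Count (stores, loads) for a toy 2-window background scheme (assumed model)."""
    depth = 0
    stores = loads = 0
    for target in chain:
        if target == depth + 1:            # call
            if depth >= 1:
                stores += 1                # caller's frame is saved behind the call
        elif target == depth - 1:          # return
            if target >= 2:
                loads += 1                 # frame below the new current one is preloaded
        else:
            raise ValueError("chain must move one level at a time")
        depth = target
    return stores, loads


if __name__ == "__main__":
    print("on-demand, 4 windows (spills, fills):", on_demand_traffic(CHAIN, 4))
    print("background, 2 windows (stores, loads):", background_traffic(CHAIN))
```

Under these assumptions the on-demand model reports 4 spills and no
fills, all of them on the way up to depth 7 and 8 and none while
bouncing between 6 and 7, whereas the background model issues 12 stores
and 7 loads over the same chain. A smarter background scheme (for
instance one that tracks dirty frames) would do better than this toy
version, so the numbers only illustrate the direction of the effect.
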
With enough windows, background load/stores don't buy much, because
register file spilling/filling just doesn't occur that often in real
programs (maybe 0.5% of all calls spill).

One other problem with "background" load/stores occurs when memory
operations take more than a single cycle. In this case, a "background"
memory operation may be started when the memory was otherwise idle, and
right after that, a regular load or store is requested. The requested
operation must wait for the background one to complete, decreasing
performance (this is what I think Brian Case was talking about when he
mentioned having to predict future operations -- to ensure that this
kind of collision doesn't occur).

Finally, if background loads/stores are interspersed with the regular
load/store stream, they cannot take full advantage of their sequential
nature, and thus cannot take full advantage of any faster burst-mode
capability that memory may provide.

	-- Tim Olson
	Advanced Micro Devices
	(tim@crackle.amd.com)