Path: utzoo!mnetor!uunet!lll-winken!lll-tis!ames!umd5!purdue!i.cc.purdue.edu!j.cc.purdue.edu!pur-ee!hankd
From: hankd@pur-ee.UUCP (Hank Dietz)
Newsgroups: comp.arch
Subject: Re: Wulf's WM
Message-ID: <7992@pur-ee.UUCP>
Date: 25 Apr 88 17:34:08 GMT
References: <28200131@urbsdc> <1508@pt.cs.cmu.edu> <50669@sun.uucp>
Organization: Purdue University Engineering Computer Network
Lines: 41
Summary: CSPI also uses FIFO memory interface

In article <50669@sun.uucp>, ram%shukra@Sun.COM (Renu Raman (Sun Microsystems)) writes:
> In article <1508@pt.cs.cmu.edu> agn@UNH.CS.CMU.EDU (Andreas Nowatzyk) writes:
> >Of particular appeal to me was the introduction of fifo's into the
> >load/store instructions (decoupling the time when an address is issued
> >from the time when the data is accessed) as it has the *potential* of
> >allowing more latency in the memory system without degrading the throughput.
>
> This is not new.  Read about the ZS-1 in the previous ASPLOS conference

Actually, it has been common for a while now.  I don't remember the
model, but I know at least one of CSPI's array processors used FIFO
interfaces not only to decouple memory references, but also to decouple
interactions between control, address generation, and arithmetic
hardware.  You are right about the ZS-1... it looks very similar to WM.

Besides, for the past 6 or 7 years, at least a few hardware people I
know have used "dataflow microarchitecture" (i.e., FIFO-interconnected
functional units) in building conventional-looking special-purpose
machines.

There is a catch, however, in that FIFOs to shared resources start
showing the usual dataflow problems:  operand-to-operation matching and
high bus bandwidth requirements.  These are non-trivial problems to
solve dynamically.  WM solves them by forcing operations of each type to
be executed in the original sequence, although different types of
operations can execute in a variable order relative to each other.  This
is quite a strong restriction.
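
To make that ordering rule concrete, here is a toy sketch in Python.
Everything in it is an assumption made for illustration (the function
name run, the unit labels "mem" and "alu", and the single "ready cycle"
standing in for latency); it is not WM's instruction set or hardware.
Each operation type gets its own FIFO, only the head of a FIFO may
complete, and the FIFOs drain independently of one another.

```python
from collections import deque

def run(program):
    """Toy model: program is a list of (name, unit, ready_cycle) tuples
    in original program order. Returns names in completion order."""
    # One FIFO per operation type; each FIFO preserves program order.
    fifos = {}
    for name, unit, ready in program:
        fifos.setdefault(unit, deque()).append((name, ready))

    done, cycle = [], 0
    while any(fifos.values()):
        # Only the head of each FIFO is eligible, so no operand
        # matching is needed; at most one completion per unit per cycle.
        for unit in sorted(fifos):
            q = fifos[unit]
            if q and q[0][1] <= cycle:
                done.append(q.popleft()[0])
        cycle += 1
    return done

# Hypothetical four-operation program.
program = [
    ("load_a", "mem", 3),   # slow memory reference
    ("load_b", "mem", 0),   # ready at once, but queued behind load_a
    ("add_1",  "alu", 0),
    ("add_2",  "alu", 1),
]
order = run(program)
print(order)   # ['add_1', 'add_2', 'load_a', 'load_b']
```

Both adds finish ahead of the loads (different types, variable relative
order), but load_b cannot pass load_a even though it is ready first.
That head-of-line blocking is the restriction in question.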

Personally, I'd rather see these problems solved by static scheduling at
compile time:  a la VLIW (but not a VLIW machine).  Burton Smith has a
design which uses static information to control dynamic out-of-sequence
evaluation (he's been telling me about this for some years now, but he's
building it, not publishing on it), and I've been involved in a couple
of designs with similar properties (e.g., SBMs -- Static Barrier MIMDs
-- papers/references available upon request).  Statically scheduled, but
not necessarily fixed sequence, machines seem to have the benefit of
very simple hardware supporting a very general execution mechanism, and
the compiler technology to get good results is easy enough (for us
compiler gurus :-).

Hank Dietz
Compiler-oriented Architecture Researcher from Purdue
hankd@ee.ecn.purdue.edu