Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!xanth!mcnc!ece-csc!ncrcae!hubcap!mark
From: mark@hubcap.clemson.edu (Mark Smotherman)
Newsgroups: comp.arch
Subject: Re: MicroVAX emulation (really : DEC about-face)
Summary: Clark & Strecker article, INDEX inst. example, HPS impl. of VAX
Keywords: RISC, CISC, HPS
Message-ID: <5064@hubcap.clemson.edu>
Date: 11 Apr 89 14:43:10 GMT
References: <807@microsoft.UUCP> <92634@sun.uucp> <13322@steinmetz.ge.com> <573@loligo.cc.fsu.edu>
Organization: Clemson University, Clemson, SC
Lines: 44

In article <573@loligo.cc.fsu.edu>, bauer@loligo.cc.fsu.edu (Jeff Bauer) writes:
> Boy, all things do come around again...and again.

I have a copy of a paper from grad school days by Clark and Strecker of DEC

    Douglas Clark and William Strecker, "Comments on 'The Case for the
    Reduced Instruction Set Computer,' by Patterson and Ditzel," Computer
    Architecture News, vol. 8, no. 6, October 15, 1980, pp. 34-38.

I've always wondered why they seem to take a swipe at their own designers
when, in discussing why the INDEX function was faster on the 780 if
implemented as a sequence of simple instructions, they say:

    "Anecdotal accounts of irrational implementations are certainly
                           ^^^^^^^^^^ (my emphasis)
    interesting.  Is it *typical*, however, that composite instructions
    run more slowly than equivalent sequences of simple instructions?
    The paper reports that a sequence of several simple instructions can
    replace the VAX INDEX instruction with a 45% speed gain on the 780.
    This is a problem of implementation, not architecture.  Fundamentally,
    after all, the implementation of the INDEX *function* with more than
    one instruction simply cannot take less time than the one-instruction
    version, assuming equal hardware in both cases.  The explanation of
    this anomaly is that the 780's Floating Point Accelerator speeds up
    the multiply in the multi-instruction implementation, but doesn't see
    the INDEX at all."
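
For anyone without the architecture handbook at hand, here is a rough sketch
of the INDEX *function* under discussion, as I understand it: check the
subscript against its bounds, add it to the running index, and multiply by
the size. The Python function names and the exception class are hypothetical,
made up for illustration. The "simple sequence" is my own decomposition, not
the actual instruction sequence measured in the Patterson and Ditzel paper.

```python
class SubscriptRangeTrap(Exception):
    """Hypothetical stand-in for the hardware subscript-range trap."""


def index_one_instruction(subscript, low, high, size, indexin):
    """Composite form: bounds check, add, and multiply as one operation."""
    if subscript < low or subscript > high:
        raise SubscriptRangeTrap(subscript)
    return (indexin + subscript) * size


def index_simple_sequence(subscript, low, high, size, indexin):
    """Same function, written as separate simple steps (assumed breakdown)."""
    if subscript < low:            # compare and branch
        raise SubscriptRangeTrap(subscript)
    if subscript > high:           # compare and branch
        raise SubscriptRangeTrap(subscript)
    partial = indexin + subscript  # integer add
    return partial * size          # integer multiply


# Byte offset of a[2][3] in a row-major array a[0..9][0..4] of 4-byte elements.
row = index_one_instruction(2, 0, 9, 5, 0)       # (0 + 2) * 5
offset = index_one_instruction(3, 0, 4, 4, row)  # (row + 3) * 4
```

Both forms compute the same result, so which one runs faster on a given
machine depends only on how each step is implemented, which is the point
Clark and Strecker are making.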

This is interesting to reread after the series of email articles discussing
how hard it is to pipeline the VAX architecture.  I've heard that the real
win on VAX implementations is to put in a heavy-duty microcode pipe.

Also, does anyone know if DEC is working on an HPS (a.k.a. micro-dataflow,
restricted dataflow, decoupled VLIW) version of the VAX?  Yale Patt reported
work on this at the 1986 Microprogramming conference.

    Yale Patt, *et al.*, "Run-Time Generation of HPS Microinstructions from
    a VAX Instruction Stream," in Proc. MICRO 19, New York, Oct. 1986,
    pp. 75-81.
    (and I think a paper in MICRO-20 also)

Has DEC followed up this work?
-- 
Mark Smotherman, Comp. Sci. Dept., Clemson University, Clemson, SC 29634
INTERNET: mark@hubcap.clemson.edu    UUCP: gatech!hubcap!mark