Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!gatech!ncsuvx!mcnc!duke!macbeth!ad
From: ad@macbeth.cs.duke.edu (Apostolos Dollas)
Newsgroups: comp.arch
Subject: Re: Filling branch delay slot with test
Message-ID: <15519@duke.cs.duke.edu>
Date: 8 Sep 89 14:56:17 GMT
Sender: news@duke.cs.duke.edu
Distribution: na
Lines: 36

From article <1437@atanasoff.cs.iastate.edu>, by hascall@atanasoff.cs.iastate.edu (John Hascall):
>
>     No.  What I was alluding to was "starting down both paths" of the
>     branch and then "dumping the loser".
>
>     One (simple minded?) way of doing this would be to have two
>     parallel fetch&instruction-decoders in the pipe (perhaps with
>     suitable memory interleaving).
>
>  T      EXECUTE         F&I-D 1         F&I-D 2
>  I   -------------   -------------   -------------
>  M   BEQL AGAIN      TEST R0         -idle-
>  E   TEST R0         JSUB BAR_RTN    JSUB FOO_RTN
>  |   JSUB ???_RTN    ...             ...
>  V
>

A great idea!  In fact, we have been doing research in this area for the
last several years.  We have developed a working prototype system with
multiple instruction decoding units (IDUs) and have filed for a patent.

The system uses program flow graph information to prefetch instructions
so that the IDUs are kept full at all times.  This extra hardware enables
*all* branch instructions to execute without any breaks (bubbles) in the
instruction pipeline.  As a result, the sustained performance of the
execution unit is maximized.

We are preparing an article for publication and will be glad to send
copies to anyone interested.  (We have not forgotten those of you who
requested more information in April!)

=========================================================================
Robert F. Krick                         PHONE:  (919) 684-3123 x61
Dept. of Electrical Eng.                CSNET:  rfk@dukee
Duke University                         UUCP:   decvax!duke!dukee!rfk
Durham, NC  27706                       ARPA:   rfk@dukee.egr.duke.edu
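
The "start down both paths, dump the loser" idea discussed in this post can be illustrated with a toy cycle count. This is only a sketch under stated assumptions, not the prototype described above: the function name `count_cycles`, the one-cycle `refill_penalty`, and the rule that a single decoder always fetches the fall-through path are all hypothetical choices made for illustration.

```python
def count_cycles(outcomes, dual_decode, refill_penalty=1):
    """Count cycles needed to issue a sequence of branches.

    outcomes:       list of bools, True if the branch was taken.
    dual_decode:    if True, both successors are assumed to be already
                    fetched and decoded, so the losing path is dropped
                    at no cost (hypothetical idealization).
    refill_penalty: assumed bubble cycles when a single decoder has
                    fetched the fall-through path and the branch is taken.
    """
    cycles = 0
    for taken in outcomes:
        cycles += 1  # the branch itself issues
        if taken and not dual_decode:
            cycles += refill_penalty  # wrong path was fetched: bubble
    return cycles


if __name__ == "__main__":
    trace = [True, False, True, True]  # made-up branch outcomes
    print("single decoder:", count_cycles(trace, dual_decode=False))
    print("dual decoders: ", count_cycles(trace, dual_decode=True))
```

In this model the dual-decoder case never pays the refill penalty, which is the sense in which branches execute without bubbles; real hardware would also have to keep both decoders supplied with instructions, which the model simply assumes.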