Path: utzoo!utgpu!watmath!att!ucbvax!tut.cis.ohio-state.edu!pt.cs.cmu.edu!andrew.cmu.edu!+
From: wilken@husky.ece.cmu.edu (Kent Wilken)
Newsgroups: comp.arch
Subject: Re: delayed branch
Message-ID: <8908071844.AA03398@husky.ece.cmu.edu>
Date: 7 Aug 89 18:44:44 GMT
References: <2246@taux01.UUCP> <1462@l.cc.purdue.edu> <26139@shemp.CS.UCLA.EDU> <7543@cbmvax.UUCP> <1989Aug6.164210.11976@jarvis.csri.toronto.edu>
Reply-To: wilken@husky.ece.cmu.edu (Kent Wilken)
Organization: Carnegie Mellon, Pittsburgh, PA
Lines: 23

In article <1989Aug6.164210.11976@jarvis.csri.toronto.edu> jonah@db.toronto.edu (Jeffrey Lee) writes:
>... Having conditional delay slots for conditional branches takes
>some of the sting out of this. ...
>The delay action can be made ``conditional'' by adding silicon to
>nullify the action (e.g. prevent writeback of the results to the
>register cells or abort a load/store instruction) in the case of the
>branch not taken (or vice versa). ...
>Which choice to implement depends on the expected probability of
>branch taken versus not taken. [Sorry, I have no data on this.] ...

The paper "Reducing the Cost of Branches" by McFarling and Hennessy
(13th Comp. Arch. Conf. pp 396-403) addresses this topic. It presents
data showing the branch-taken probability to be roughly 65%. The paper
shows that the nullify action, called "squashing", can significantly
reduce the cost of branches for a profiled program. ("Cost" is a
performance measure that includes the performance cost of NOPs.)

A follow-up paper, "Architectural Tradeoffs in the Design of the MIPS-X"
by Chow and Horowitz (14th Comp. Arch. Conf. pp 300-308), shows that
squashing melds nicely with exception handling (i.e., the amount of
added silicon is small). They also report that the MIPS-X, which has
two branch delay slots, uses 16-18% NOPs.

Kent Wilken
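
To make the "which choice to implement" question concrete, here is a rough back-of-the-envelope sketch. It is not the cost model from either paper. It assumes every delay slot holds an instruction that is useful in only one branch direction and is squashed in the other, and it ignores slots filled with always-useful instructions or NOPs. The function name and parameters are made up for illustration.

```python
def expected_squashed_slots(p_taken, delay_slots, squash_if="not_taken"):
    """Expected delay-slot cycles squashed per conditional branch.

    Simplifying assumption (not from the cited papers): every slot is
    filled with an instruction that only helps in one direction, so all
    slots are squashed whenever the branch goes the other way.
    """
    if not 0.0 <= p_taken <= 1.0:
        raise ValueError("p_taken must be a probability in [0, 1]")
    if delay_slots < 0:
        raise ValueError("delay_slots must be non-negative")

    if squash_if == "not_taken":
        # Slots filled from the branch target; wasted when falling through.
        p_squash = 1.0 - p_taken
    elif squash_if == "taken":
        # Slots filled from the fall-through path; wasted when taken.
        p_squash = p_taken
    else:
        raise ValueError("squash_if must be 'not_taken' or 'taken'")

    return delay_slots * p_squash


if __name__ == "__main__":
    # Taken probability of roughly 65%, two delay slots as on the MIPS-X.
    for policy in ("not_taken", "taken"):
        print(policy, expected_squashed_slots(0.65, 2, squash_if=policy))
```

Under these assumptions, with a 65% taken rate and two slots, squash-if-not-taken wastes about 0.7 slot cycles per branch against about 1.3 for squash-if-taken, which is why the taken probability drives the choice.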