Path: utzoo!mnetor!uunet!husc6!bloom-beacon!gatech!hubcap!ncrcae!ncr-sd!crash!telesoft!roger From: roger@telesoft.UUCP (Roger Arnold @prodigal) Newsgroups: comp.arch Subject: conditional branches Message-ID: <191@telesoft.UUCP> Date: 11 Feb 88 01:06:57 GMT Organization: TeleSoft Inc., San Diego, CA Lines: 49 Keywords: frequency statement, flag registers, invariant expressions There's been a lot of discussion lately on the utility of FREQUENCY statements and the subject of branch prediction, etc. But nobody has been addressing an issue that looks (to me) to offer more potential for improving performance. A significant fraction of real code always seems to be devoted to repeating tests whose outcome is "already known". I'm thinking of things like tests on input parameters, when the tests occur multiple times within the same subprogram activation. There may be a standard technical term for what I'm talking about, but I'm not aware of it. "Loop invariant conditional expressions" are a major subcategory, but the category I want includes any test whose outcome can be determined by another test or tests preceding it in the data flow--recognizing that "another test" could include a prior execution of the same test. It's easy for an optimizer that does global data flow analysis to recognize such tests, but no architecture that I know of gives it anything very interesting to do with them. Sure, for a complex loop invariant test, the optimizer can move the test into the loop pre- header, and substitute a simple test on a boolean varible inside the loop. Big deal. The test on the variable and the conditional branch still end up sitting in the loop, taking up code space and execution cycles. Or the optimizer can get wild, and set up one copy of the loop as the THEN part of an IF THEN ELSE, and another as the ELSE part. That way the invariant tests, and any code that becomes dead as a result of the test hoisting, are eliminated from the loops. Execution efficiency is optimized, but the high cost in code space makes it practical only for small loops. What I'd like to see in an architecture, as a minimum, would be a set of one bit registers into which boolean values can be written. It's silly to have to tie up an entire 32-bit register to hold a boolean value. Especially since the register will still have to be tested to make its value accessible to the branch unit. With addressable flag registers, conditional branch instructions could directly reference the flag of interest, with no preceding test instruction. The flag registers could be available to an early stage of the instruction decode pipe, so that the branch instructions referencing them would be promptly converted to no-ops or unconditional branches, as appropriate. Five of the flag registers could be reserved for the standard ALU flags (zero, negative, positive, carry, and overflow), so that only one general form of conditional branch would be needed. Now, who's going to tell me that I've just described their favorite machine from out of the past? Surely SOMEONE has used this idea before? - Roger Arnold ..sdcsvax!telesoft!roger