Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!clyde!cbatt!ihnp4!houxm!houxk!houxs!daw
From: daw@houxs.UUCP (D.WOLVERTON)
Newsgroups: comp.arch
Subject: How does compiled code use the floating point unit?
Message-ID: <394@houxs.UUCP>
Date: Fri, 5-Dec-86 16:15:04 EST
Article-I.D.: houxs.394
Posted: Fri Dec 5 16:15:04 1986
Date-Received: Sat, 6-Dec-86 18:13:38 EST
Organization: AT&T Information Systems, Holmdel NJ
Lines: 64

In some systems, the hardware floating point (fp) unit is _optional_.
The Itty Bitty Machines (IBM) PC is a good example. From the point
of view of a compiler writer, how does one deal with that uncertainty?
[<--this one's a rhetorical question]

I know of, or can imagine, several flavors of code generation in the
face of this situation:

1) Code generation emits calls to a floating point library. This
   library checks for the presence of fp hardware, and uses the fp
   hardware if it is present; otherwise it emulates the operation.

2) Like (1), but the test for the fp unit is made before the function
   call. The code is larger, but in the case where the fp unit is
   present it is faster because no function call is performed.

3) Code generation pretends that the fp unit will always be present,
   so it emits code which uses the fp unit directly into the
   instruction stream. If a fp unit is not present, the hardware
   arranges for a trap to occur which transfers control to the OS.
   At this point either:

   a) The OS recognizes that a fp operation was intended, and
      completes the operation by executing its own emulation code.
      Control is then transferred back to the user code.

   b) The OS recognizes that a fp operation was intended, and calls
      a special fp emulation entry point in the user code. When the
      function which emulates the fp operation is finished, it
      transfers control back to the user code.

4) Code generation emits code which always causes transfer to the OS,
   e.g. by illegal opcodes or TRAP instructions.
   The OS then proceeds like (3a) or (3b) above, except that the fp
   unit may be used if present.

I like (3) the best. In the case where a fp unit is present, the
performance is no worse than if it were assumed that the fp unit would
_always_ be present. If a fp unit is not present, the user's code
will still execute, but more slowly. Furthermore, the user can upgrade
his floating point performance by adding the fp unit, without
re-compiling his code. (3a) has the slight additional advantage over
(3b) that the user programs will be smaller because they do not have
to carry the baggage of a fp emulation library.

However, (3) also requires that the fp unit architecture be known
a priori. It also does not account for a need to support more than
one incompatible fp unit.

Now the questions:

Are there other scenarios in use?

Anyone have a different choice for "best"? Why?

Which is "best" if more than one fp unit must be supported, or if the
architecture of the fp unit is not known a priori?

===================================================================
David Wolverton                         ...!ihnp4!houxs!daw
AT&T Information Systems, Holmdel
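
P.S. To make the difference between flavors (1) and (2) concrete, here
is a rough sketch in C of what the generated code amounts to. It is
only an illustration under stated assumptions: fp_unit_present, hw_add,
emulated_add, fp_add and FP_ADD_INLINE are all made-up names, the flag
is a plain variable rather than a real hardware probe, and the
"emulation" routine is a placeholder that reuses the host's arithmetic
instead of doing the work with integer instructions.

```c
#include <stdbool.h>

/* Hypothetical flag. A real runtime would probe the machine once at
 * startup; here it is a plain variable so both paths can be exercised. */
bool fp_unit_present = false;

/* Counters so a caller can observe which path ran (illustration only). */
unsigned hw_ops = 0;
unsigned emulated_ops = 0;

/* Stand-in for the instruction sequence that drives the fp unit. */
double hw_add(double a, double b)
{
    hw_ops++;
    return a + b;
}

/* Stand-in for a software routine. A real one would operate on the
 * sign, exponent and fraction bits using only integer instructions;
 * this placeholder just reuses the host's arithmetic. */
double emulated_add(double a, double b)
{
    emulated_ops++;
    return a + b;
}

/* Flavor (1): compiled code always calls the library, and the test
 * for the fp unit is made inside the library routine. */
double fp_add(double a, double b)
{
    return fp_unit_present ? hw_add(a, b) : emulated_add(a, b);
}

/* Flavor (2): the test is emitted at every call site, so the
 * hardware path involves no function call at all. */
#define FP_ADD_INLINE(a, b) \
    (fp_unit_present ? (hw_ops++, (a) + (b)) : emulated_add((a), (b)))
```

With the flag clear, fp_add(1.5, 2.25) goes through emulated_add; with
it set, FP_ADD_INLINE(1.5, 2.25) does the addition in line. The result
is the same either way. The trade is code size at each call site
against the cost of a call on the hardware path.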