Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!mips!winchester!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: 386 Clones [really: IEEE floating point & various approaches; long]
Message-ID: <42597@mips.mips.COM>
Date: 1 Nov 90 02:56:12 GMT
References: <1990Oct26.015244.586@amd.com> <8464@scolex.sco.COM> <2816@crdos1.crd.ge.COM> <2451@charon.cwi.nl>
Sender: news@mips.COM
Reply-To: mash@mips.COM (John Mashey)
Organization: MIPS Computer Systems, Inc.
Lines: 132

In article <2451@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
...
>The true Cray and Convex users do not care about IEEE.  Their machines do
>not have IEEE conformant arithmetic; far from it.  They want results; fast.
>They do not care about correctness. :-)

Note, also, that this somewhat applies to the IBM RS/6000 series.
Legally (the IEEE standard permits it), but unlike almost all other
IEEE FP implementations, FP OPERATIONS DO NOT NORMALLY TRAP ON EXCEPTION
CONDITIONS; i.e., SIGFPE doesn't normally do anything.  Your choices are:
	a) Run the code in a mode where FP operations are sequentialized,
	which of course seriously degrades performance, but gives precise
	exceptions.  (good for debug)
	b) Make explicit calls in your code to routines that test the
	status of the various flags.  (probably not favored by programmers,
	especially since it de-portabilizes otherwise-portable code.)
	c) (I'm not sure if this is shipped yet, or not; does anybody KNOW?):
	use compiler switches that generate the relevant tests, e.g., at
	end of statement, end of function, or program exit.

There are some reasonable reasons for this, and I attach the relevant
quotes from the IBM J of R & D that describes the RS/6000.
HOWEVER, people may want to be warned that code ported from other
machines may NOT trap exceptions the way that most other IEEE machines do.
(I would assume that if you trap SIGFPE, the runtime at least, on
program exit, tells you that there has been an exception "sometime".
If it doesn't do this, I suspect there is a bunch of code out there that
thought it was protecting itself, and didn't.)

Here's what the IBM man page says:

fp_clr_flag, fp_set_flag, fp_read_flag, or fp_swap_flag
.....
Description
	The RISC System/6000 currently does not generate an interrupt for
	floating-point exceptions.  Therefore, the common method of catching
	the signal SIGFPE and calling an appropriate trap handler to identify
	a floating-point exception is not supported.  These subroutines aid
	in determining when an exception has occurred and the exception type.
	These subroutines can be called explicitly around blocks of code
	that may cause a floating-point exception.

Then, here is some explanation (long):
From IBM J Res & Dev, Vol 34, No 1, January 1990 (IBM RS/6000 issue)
p.33-34 (CAPITALS MINE)

"Another very important aspect of fully exploiting floating-point
performance is the method of presentation of floating-point exceptions
and the precision in identifying the instructions that cause
floating-point exceptions.  Exceptions are a natural and perhaps expected
consequence of floating-point operations, and most can be handled by
default rules.  (Default exception handling is defined by the IEEE
standard.)  These default rules can be managed completely in hardware
and require no program intervention after initialization.

The IEEE default rules do not always provide the desired result, however.
Since the standard allows for a program fix-up after an exception, the
architecture problem is then to define a mechanism to permit program
fix-up.  The most straightforward approach is to specify that a
floating-point interrupt at the failing instruction will occur whenever
there is a floating-point exception that is not defaulted.
The hardware implication of this is that all instructions after a
floating-point instruction must be conditional until it is known that no
exceptions are possible on that instruction.  Some floating-point
instructions take many cycles, and exceptions may not be known until the
last cycle of the instruction.  Therefore, most implementations would
serialize on floating-point instructions -- if not the first, then the
second; if not all, then some.  The inclusion of a floating-point
interrupt would sacrifice much of the potential floating-point
performance.

AN ALTERNATIVE STRATEGY IS NOT TO REPORT AN INTERRUPT AT ALL, BUT SIMPLY
TO SET A BIT INDICATING THAT A FLOATING-POINT EXCEPTION HAS OCCURRED.
IT IS THEN UP TO A PROGRAM TO TEST FOR FLOATING-POINT EXCEPTIONS.
Different compiler strategies can be used as to where it is appropriate
to test for these exceptions.  Since the definition of the exception also
includes the setting of summary information, it is possible to test at
the end of a program, at the end of a subprogram, or at the end of a
statement where a floating-point operation was used.  This level of
precision can be controlled by linker/compiler option.  None of these
tell exactly where the exception occurred; they simply identify where it
occurred.  In most cases, this information is sufficient.

However, if the exact failing instruction must be known, there are two
possible strategies.  One can insert a test for the exception after each
floating-point instruction, or one can tag each queued and/or executing
floating-point instruction with its address.  Inserting code to test for
every possible exception is yet another mode for the compiler to manage,
necessitates recompilation, and can significantly expand execution time.
Address tagging of "active" floating-point instructions identifies the
failing instruction exactly.  However, it does require that the
implementation keep track of the address tags.
Moreover, it is not synchronous; that is, if an exception occurs, the
location of the failing instruction is reported, but not before the
program has gone beyond that point.  Fix-up may still be possible, but in
general this method only permits localization of the failing instruction.
Consider the case of the inner-loop product described in Figure 7.  This
loop consists of two floating-point loads, one floating-point
multiply-add, and one branch.  The "active" floating-point instructions
will all be instances of the same multiply-add instruction.  If an
exception occurs, what is known is the address of the instruction, not
the iteration number.  The benefit of this approach is speed;
floating-point performance is not limited by exception recognition.
The drawback, as outlined above, is the precision with which the fault
is determined.

RISC System/6000 architecture adopted a two-part strategy.
THE PRINCIPAL APPROACH WOULD BE TEST-CODE INSERTION, with the compilers
able to insert such code at the statement or (sub)program level.  The
linker also supports the enabling of test code at program exit, ensuring
the ability to report a floating-point exception if it occurs anywhere
within the program.  To avoid recompilation in order to identify the
failing operation exactly, the architecture also adopted a synchronize
mode, in which an interrupt can be generated, identifying the failing
instruction by running the machine with one floating-point instruction
dispatched at a time.  This technique has the same weakness as code
insertion; THAT IS, FLOATING-POINT PERFORMANCE IS GREATLY REDUCED.
However, it may not be as bad as code insertion, because the
synchronization can be managed by hardware rather than by extra code
inserted by software.  IT IS EXPECTED THAT THE MODE WILL ONLY BE USED BY
CERTAIN PROGRAMS AND THEN ONLY TO DEBUG THEIR ALGORITHMS."
--
-john mashey	DISCLAIMER:
UUCP: 	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086