Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!apple!mips!sgi!bron@bronze.wpd.sgi.com
From: bron@bronze.wpd.sgi.com (Bron Campbell Nelson)
Newsgroups: comp.sys.sgi
Subject: Re: f77 performance under 3.3.1
Summary: f.p. exceptions
Message-ID: <73658@sgi.sgi.com>
Date: 30 Oct 90 18:05:41 GMT
References: <90Oct29.145239est.57616@ugw.utcs.utoronto.ca>
Sender: guest@sgi.sgi.com
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 38

In article <90Oct29.145239est.57616@ugw.utcs.utoronto.ca>, WJP@VM.NRC.CA (Wayne Podaima) writes:
>
> We have some unhappy observations re performance of Fortran programs under
> IRIX 3.3.1. The same program compiled under 3.3.1 runs about 5% slower
> than it did under 3.2.3 (same compiler options), using 1 cpu of a 4D/240S.
> The program is compute intensive - no paging, no disk I/O, double precision.
> Any ideas; anyone with similar results?
>
> We then tried the libfpe.a floating point exception handling library,
> expecting it to be still slower; well it ran 2% FASTER than the original
> 3.2.3 compiled version. What gives?
>

I don't know about the first part, but I can comment on the second:
in all likelihood your program is getting a number of underflow
exceptions (IEEE denormalized numbers). Denorms are *not* handled by
the f.p. hardware; instead they are punted to an exception handler
(which does the f.p. operation in software).

In libfpe.a, on the other hand, the default setting for underflow is
to flush the value to zero rather than to compute the denorm. Note
that this setting is *not* IEEE compliant. The advantage is that zero
is a value that the hardware can handle. You still take an exception
when the value is first computed, but by flushing it to zero, you do
not take further exceptions when that value is subsequently used.

(Hummm ... I see someone might read that wrong .. the f.p. underflow
"exception" is *not* an error, it is just an exceptional (unusual)
event. Therefore it is dealt with by an exception handler, rather
than by throwing hardware at the problem.)

The numbers you report above seem to indicate you were spending 10%+
of your time dealing with denorm values, so you must have been getting
a *lot* of them.

--
Bron Campbell Nelson
bron@sgi.com or possibly ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.
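
The difference between computing denorms and flushing them to zero, as described in the post above, can be sketched in portable C. This is only an illustration under stated assumptions: `is_denorm`, `flush_to_zero` and `count_denorm_results` are hypothetical names invented here, the flushing is done in plain software, and nothing below is the libfpe.a interface or the IRIX exception handler.

```c
#include <float.h>
#include <math.h>

/* Nonzero if x is an IEEE denormalized (subnormal) number. */
static int is_denorm(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Hypothetical helper (not part of libfpe.a): replace a denorm
 * with a zero of the same sign, leave every other value alone. */
static double flush_to_zero(double x)
{
    if (!is_denorm(x))
        return x;
    return signbit(x) ? -0.0 : 0.0;
}

/* Halve `start` `steps` times and count how many results come out
 * denormalized. On the hardware discussed in the post, each such
 * result would be handed to the software exception handler.
 * If `flush` is nonzero, a denormalized result is flushed to zero
 * as soon as it appears, so later operations see an ordinary zero. */
static int count_denorm_results(double start, int steps, int flush)
{
    volatile double x = start;  /* volatile: force a true double store */
    int count = 0;

    for (int i = 0; i < steps; i++) {
        x = x * 0.5;
        if (is_denorm(x)) {
            count++;
            if (flush)
                x = flush_to_zero(x);
        }
    }
    return count;
}
```

Starting from `DBL_MIN` (the smallest normalized double) and halving 10 times, `count_denorm_results(DBL_MIN, 10, 0)` reports 10 denormalized results, while `count_denorm_results(DBL_MIN, 10, 1)` reports only 1: the first underflow still happens, but every later operation works on zero.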