Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!mintaka!bloom-beacon!eru!hagbard!sunic!sics.se!fuug!news.funet.fi!tukki.jyu.fi!euler!tt
From: tt@euler.jyu.fi (Tapani Tarvainen)
Newsgroups: comp.sys.hp
Subject: Re: 68040 and Floats, is this true?
Summary: alas, it is true
Message-ID:
Date: 11 Jun 91 10:18:23 GMT
References: <1991Jun07.213219.14174@lynx.CS.ORST.EDU>
Sender: news@jyu.fi (News articles)
Organization: University of Jyvaskyla
Lines: 130
In-Reply-To: tt@euler.jyu.fi's message of Sun, 9 Jun 1991 09:36:46 GMT
Originator: tt@euler.jyu.fi
Nntp-Posting-Host: euler.jyu.fi

In article I wrote:
>In article <1991Jun07.213219.14174@lynx.CS.ORST.EDU> curt@OCE.ORST.EDU (Curt Vandetta) writes:
>>  A couple of days ago, I read an article (Sorry I lost it) that someone
>>  here on the net wrote about their experience with the 68040 upgrade on
>>  the HP 9000/400t.  I currently have a 68040 upgrade kit sitting on my
>>  desk waiting for HP-UX 7.05.  Is it true that the Floating Point
>>  performance suffers as much as the previous post indicated?  I have a
>>  really uneasy feeling that it is true.

>Floating Point performance suffers!?  I'd say the question is how much
>it improves ... our experience from the 400t -> 425t upgrade is that
>floating-point intensive programs are speeded up by a factor ranging
>from around two to almost seven.

The original article referred to above arrived here today, and I must
report that I got similar results: the '040 IS much slower with
certain operations.  In particular, *printf()ing floating point
numbers is sloooow.

I dug out the HP-UX 7.05 Release Notes, which give a list of
operations the '040 can't do in hardware and which are therefore
emulated in software.  I've copied the relevant part here.
(I guess this is technically copyrighted material, but I feel
this is a justified copyright-slaughter if there ever was one.)

! Because there was not enough space on the chip, some instructions were
! chosen to be emulated in software.
! That is, instead of having the instruction interpreted by the hardware
! directly, a software trap is taken into the kernel, and software in the
! kernel does the requested operation.  Because they are done in software,
! the algorithms used may be slightly different than the algorithms that
! would have been used on the 68882.  Thus, there are differences in the
! results of the same instruction on the 68882 and 68040.
!
! Differing results are typically measured in "Units in the Last Place"
! (ULPs), which indicate the distance between the true mantissa and the one
! calculated.  For example, if the real mantissa is 0x4572 and the
! calculated mantissa is 0x456E, the difference is 4 ULPs.
!
! The MC68882 documentation states that "in general, the worst-case accuracy
! of any transcendental function is one unit in the last place of double
! precision."  The software that emulates these instructions is designed to
! give the same accuracy.  This means that, on average, the double precision
! representation should be within one ULP of the true value.  This does not
! mean that the 68882 and the 68040 give identical results, only that they
! both should be close to the desired value.
!
! Emulated Instructions
! ---------------------
! The instructions which are emulated in software are given below.
! Instructions marked with a (*) return exact results, the others are within
! one ULP in double precision.
!
!   Instr.     Description              HP-UX Usage
!   -------------------------------------------------------------
!   Trig Functions
!     fcos     Cosine                   libm, inline Fortran/C
!     facos    Arc Cosine               libm, inline Fortran/C
!     fsincos  Sine and Cosine
!     ftan     Tangent                  libm, inline Fortran/C
!     fsin     Sine                     libm, inline Fortran/C
!     fasin    Arc Sine                 libm, inline Fortran/C
!     fatan    Arc Tangent              libm, inline Fortran/C
!
!   Hyperbolic Functions
!     fsinh    Hyperbolic Sine          libm, inline Fortran/C
!     fcosh    Hyperbolic Cosine        libm, inline Fortran/C
!     ftanh    Hyperbolic Tangent       libm, inline Fortran/C
!     fatanh   Arc Hyper Tangent
!
!   Exponential Functions
!     flog2    Log base 2
!     flog10   Log base 10              libm, inline Fortran/C
!     flogn    Log base e               libm, inline Fortran/C
!     flognp1  Log base e of (x+1)
!     ftwotox  2 to the x
!     ftentox  10 to the x
!     fetox    e to the x               libm, inline Fortran/C
!     fetoxm1  (e to the x) - 1
!
!   Utility Functions
!     fint     Integer Part (*)         Fortran Library
!     fintrz   Same, Round to Zero (*)  All Compiled Code using floats
!     fgetexp  Get Exponent (*)
!     fgetman  Get Mantissa (*)
!     frem     IEEE Remainder
!     fscale   Scale Exponent
!     fmod     Modulo Remainder         Fortran Library
!
!
! Unsupported Data Types
! ----------------------
! Besides the emulated instructions discussed above, the MC68040 does not
! have support for any kind of denormalized numbers on the chip.  This
! includes denormalized single and double precision numbers, as well as the
! less common denormalized extended precision.  In order to handle these
! types, a software trap is taken into the kernel when these data types are
! encountered.
!
! A denormalized number is a smaller number than could normally be
! represented.  These are included to extend the range around zero.  Since
! they are in the minority, and since the data type handler can do exactly
! what the 68882 can do (that is, answers between the two chips should be
! the same), this should not cause any problems for most users.  Because of
! the trap and emulate, dealing with denormalized numbers will be much
! slower than dealing with normalized numbers.
!
! Another data type which is not supported is packed decimal.  Packed
! decimal is used to convert from binary floating point formats to the usual
! decimal form.  This type is used by scanf() and printf() to input and
! output floating point numbers.  Since the emulator uses the same algorithm
! that the 68882 used, the two chips should give the same result.

Some comments:

Cursory testing suggests that for the most part the emulation is quite
effective.
In particular, trigs and logs appear significantly faster on the '040
even though it's emulating them in software.

The critical thing in the present case is, I think, revealed in the
last paragraph I quoted above: packed decimal support.

HP: PLEASE do something about this.  If you can't speed up the
packed decimal emulation, then try to rewrite *printf() and
*scanf() so that they don't use packed decimal at all.
-- 
Tapani Tarvainen    (tarvaine@jyu.fi, tarvainen@finjyu.bitnet)
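
For anyone who wants to check the two points above on their own machine
(the ULP arithmetic in the release notes and the cost of formatting
floats), here is a minimal C sketch.  The names ulp_distance() and
seconds_to_format() are made up for illustration and are not HP-UX
routines; the "%.10e" format and the use of clock() for timing are
assumptions too, not what the original benchmark necessarily did.

```c
#include <stdio.h>
#include <time.h>

/* Distance in ULPs between a "true" and a calculated mantissa, as in
   the release notes' 0x4572 vs 0x456E example (hypothetical helper). */
unsigned long ulp_distance(unsigned long true_mant, unsigned long calc_mant)
{
    return true_mant > calc_mant ? true_mant - calc_mant
                                 : calc_mant - true_mant;
}

/* CPU seconds spent formatting n doubles with sprintf().  On an '040
   this is the path that would go through the packed decimal emulation,
   assuming the library formats floats that way. */
double seconds_to_format(long n)
{
    char buf[64];
    double x = 1.0;
    clock_t start = clock();
    long i;

    for (i = 0; i < n; i++) {
        sprintf(buf, "%.10e", x);   /* the operation being timed */
        x += 0.5;                   /* vary the value a little */
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

ulp_distance(0x4572, 0x456E) gives 4, matching the release notes.
Calling seconds_to_format() with the same count on a 400t and a 425t
should show whether formatting alone accounts for the slowdown.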