Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!iuvax!purdue!bu-cs!buengc!bph
From: bph@buengc.BU.EDU (Blair P. Houghton)
Newsgroups: comp.lang.c
Subject: Re: IEEE floating point format
Message-ID: <3591@buengc.BU.EDU>
Date: 3 Aug 89 17:02:12 GMT
References: <2170002@hpldsla.HP.COM> <9697@alice.UUCP> <3554@buengc.BU.EDU> <9725@alice.UUCP>
Reply-To: bph@buengc.bu.edu (Blair P. Houghton)
Followup-To: comp.lang.c
Organization: Boston Univ. Col. of Eng.
Lines: 40

In article <9725@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>In article <3554@buengc.BU.EDU>, bph@buengc.BU.EDU (Blair P. Houghton) writes:
>
>> Fascinating; but, what does it mean to say "denormalized" in this context?
>
>the smallest positive number that can be
>represented in IEEE 64-bit form without going into denormalized mode
>is 2^-1022.  That number is represented this way:
>
>    0 00000000001 0000000000000000000000000000000000000000000000000000
>
>If you count the way I did in my last note, this means an exponent
>of -1021 and a fraction of .(1)0000000....
>
>The next smaller number is represented this way:
>
>    0 00000000000 1111111111111111111111111111111111111111111111111111
>
>This is the largest denormalized number: its value is 2^-1021 times
>.(0)11111...  That is, the hidden bit becomes 0 when all the exponent
>bits are 0.  Thus it is possible to represent numbers that are too
>small for the normal exponent range, albeit with reduced precision.

Okay, so "normalization" refers to ensuring that the precision
is 53 bits for any number with a nonzero exponent-field.

Next question:  do C compilers (math libraries, I expect I should mean)
on IEEE-FP-implementing machines generally limit doubles to normalized
numbers, or do they blithely allow precision to waft away in the
name of a slight increase in the number-range?

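A minimal sketch in C of the two bit patterns quoted above, assuming
an IEEE 754 64-bit double, a uint64_t stored in the same byte order as
a double, and a C99-or-later <math.h> for fpclassify().  The names
bits_to_double, class_name and show_boundary are made up for
illustration.  On an implementation that meets those assumptions, it
suggests the language itself does not stop a double from holding a
denormalized value:

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double.  Assumes IEEE 754 binary64
   and the same byte order for 64-bit integers and doubles. */
double bits_to_double(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: name the class that fpclassify() reports. */
const char *class_name(double d)
{
    switch (fpclassify(d)) {
    case FP_NORMAL:    return "normal";
    case FP_SUBNORMAL: return "denormalized";
    case FP_ZERO:      return "zero";
    case FP_INFINITE:  return "infinite";
    default:           return "nan";
    }
}

/* Print the two quoted patterns and how the library classifies them. */
void show_boundary(void)
{
    /* sign 0, exponent field 00000000001, fraction all zeros */
    double smallest_normal = bits_to_double(UINT64_C(0x0010000000000000));
    /* sign 0, exponent field 00000000000, fraction all ones */
    double largest_denormal = bits_to_double(UINT64_C(0x000FFFFFFFFFFFFF));

    printf("%a  %s\n", smallest_normal, class_name(smallest_normal));
    printf("%a  %s\n", largest_denormal, class_name(largest_denormal));
}
```

Calling show_boundary() from main() prints both values in hexadecimal
floating-point notation along with the class reported for each; the
exact %a spelling is left to the implementation.  The first pattern
should also compare equal to DBL_MIN from <float.h>.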
I expect the answer is "the compiler has nothing to do with it",
so the next question would be, are there machines that don't permit
the loss of precision without specific orders to do so?

                                --Blair
                                  "Or Fortran compilers, but I don't need
                                   those, and this ain't the group for it,
                                   this being comp.lang.c.pointer.addition...."