Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!samsung!uunet!lll-winken!sun-barr!newstop!sun!amdcad!dgcad!dg-rtp!siberia!hamilton
From: hamilton@siberia.rtp.dg.com (Eric Hamilton)
Newsgroups: comp.sys.m88k
Subject: Re: Is the FPSR interlocked with the FPU pipe?
Message-ID: <1991Apr25.143154.8469@dg-rtp.dg.com>
Date: 25 Apr 91 14:31:54 GMT
References: <1991Apr24.200412.7483@eagle.lerc.nasa.gov>
Sender: hamilton@siberia (Eric Hamilton)
Distribution: na
Organization: Data General Corporation, Research Triangle Park, NC
Lines: 36

In article <1991Apr24.200412.7483@eagle.lerc.nasa.gov>, fsset@bach.lerc.nasa.gov (Scott E. Townsend) writes:
|> This sounds too strange to be true, but is it possible for the FPSR to
|> return 'too fresh' data?  Or put another way, why should the following two
|> code fragments behave differently?
|>
|> #1
|>      fdiv.ddd  r8,r2,r4
|>      fldcr     r12,fcr62     ; fcr62 == FPSR
|>      bb1       0,r12,@L21    ; AFINX bit
|>
|> #2
|>      fdiv.ddd  r8,r2,r4
|>      tb1       0,r0,0        ; trap not taken, but system 'synced'
|>      fldcr     r12,fcr62
|>      bb1       0,r12,@L21
|>
|> This code is buried a bit, so I don't have the exact different results,
|> but the behaviour _is_ different.  Is it possible the fldcr gets whatever
|> is 'current' rather than the result of the fdiv? (which will take a while)

Yes.  These two code fragments behave differently, and for exactly the
reason that you suspect.  When the fldcr in fragment #1 executes, the
fdiv instruction has started but not completed.  The floating point
imprecise exceptions (overflow, underflow, and inexact) are signalled
only when the operation is complete, so code fragment #1 is reading the
FPSR prematurely.

The trap-not-taken drains the pipelines (at the cost of waiting
sixty-odd cycles for the fdiv to complete) so that code fragment #2
will show the effect of any imprecise exceptions provoked by the fdiv.

Note that any attempt to use r8 or r9 will have the same effect of
waiting for the fdiv to complete - there will be a scoreboard hold.

The oddity is that the FPSR, which is also a "destination" register for
the fdiv, is not interlocked with the floating point pipe.
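
As a rough sketch of the scoreboard-hold alternative mentioned above
(untested, and not a replacement for the tb1 version): reading r8
before the fldcr should stall until the fdiv has written its result.
The scratch register r13 is an assumption for illustration; any free
register would do.  This also assumes that the FPSR has been updated
by the time the scoreboard releases r8.

        fdiv.ddd  r8,r2,r4
        or        r13,r8,r0     ; r13 is a hypothetical scratch register;
                                ; reading r8 waits on the scoreboard
        fldcr     r12,fcr62     ; fcr62 == FPSR, read after the fdiv is done
        bb1       0,r12,@L21    ; AFINX bit

If the value in r8 is needed shortly afterwards anyway, the stall costs
nothing extra; otherwise the tb1 in fragment #2 states the intent more
directly.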