Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!elroy.jpl.nasa.gov!decwrl!deccrl!bloom-beacon!eru!hagbard!sunic!mcsun!hp4nl!phigate!philica!adrie
From: adrie@philica.ica.philips.nl (Adrie Koolen)
Newsgroups: comp.os.minix
Subject: Re: Floating point support in GCC
Message-ID: <795@philica.ica.philips.nl>
Date: 19 Apr 91 12:33:19 GMT
References: <280C657A.14449@maccs.dcss.mcmaster.ca>
Reply-To: adrie@beitel.ica.philips.nl (Adrie Koolen)
Organization: Philips TDS, Innovation Centre Aachen
Lines: 54

In article <280C657A.14449@maccs.dcss.mcmaster.ca> mc2@maccs.dcss.mcmaster.ca (Dan McCrackin) writes:
> I was all set to get out my proverbial (software) hacksaw and pliers
>to get my coprocessor going, when I realized that there is, alas, a major
>problem with using a '387 (or '87 or '287 for that matter) under Minix. At
>present the kernel doesn't save the context of a coprocessor during a
>context switch. This is fine if you are only running one task that uses the
>coprocessor, but would produce a fearsome mess with more than one task. :-(
>
>There are three possible solutions
>	(1) only run one fp task (simple, but impractical)
>	(2) modify mpx.x (the save and _restart sections) to always
>	    save / restore the coprocessor context
>	(3) use the MP and TS flags of CR0 (on the 80386) to implement
>	    saving / restoring the coprocessor's context only as required.
>
> The problem with (3) is that while it is the most efficient approach
>(save only when absolutely needed), it would (IMHO) complicate the kernel
>a fair bit.

When I ported Minix to the Sun SparcStation, I didn't support the FPU at
first. However, every SparcStation has an FPU which is quite fast (9
processor cycles to multiply two doubles!), so I added FPU support to the
kernel. Initially, the FPU is disabled for all user processes. When a user
process tries to execute a floating point instruction, it is trapped. The
trap handler marks the process as a FPU user and enables the FPU for this
process.
From then on, the (32) FPU registers are saved and restored at task
switches (yes, also if there's only one process using the FPU). It will
take some time, but I can (i.e. have to) live with that. The overhead isn't
that bad. The changes I made in the kernel were quite small. I guess that
the FPU changes for a 387, which has exception handling comparable to that
of the Sparc FPU, will also be quite simple.

> The problem with (2) is (according to ye olde Intel manual)
>an FSAVE takes > 100 clock cycles and dumps 90-some-odd bytes from the
>coprocessor. FRSTOR has similar characteristics. The $64K question is
>how much impact would this have on system performance? At least the
>method of (2) is generally applicable to 8087's through 80386's.

There will certainly be some (substantial) overhead, but I think that it
won't be much more than 1% (rough estimate: if you `lose' 25us per task
switch saving/restoring FPU registers and you get 100 switches per second,
you lose 0.25% CPU performance). Compare that with the speed you gain by
using your 387 and you'll agree that it's worth the trouble. As a bonus,
you won't lose any performance when no processes use the FPU!

Adrie Koolen (adrie@ica.philips.nl)
Philips Innovation Centre Aachen

PS. With Minix-Sparc and a brute-force mandelbrot program, I can generate
full screen (1152*900) parts of the mandelbrot set within ONE minute!
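
To make the scheme described above a bit more concrete (FPU disabled until
a process first touches it, then saved and restored at every task switch
for that process), here is a rough sketch in C. It is NOT the Minix-Sparc
kernel code: the struct, the function names and the simulated "hardware"
variables are all hypothetical, and a real kernel would flip the FPU enable
bit and move the registers with machine instructions instead of memcpy().
The last function just redoes the back-of-the-envelope overhead estimate.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_FPU_REGS 32  /* the (32) FPU registers mentioned above */

/* Hypothetical per-process state; not the real Minix proc structure. */
struct proc {
    bool uses_fpu;                   /* set on the first FP instruction */
    unsigned fpu_regs[NR_FPU_REGS];  /* saved FPU context */
};

/* Simulated hardware: stand-ins for the FPU enable bit and registers. */
bool fpu_enabled = false;
unsigned hw_fpu_regs[NR_FPU_REGS];
int fpu_saves = 0;                   /* number of context saves so far */

/* Runs when a process executes an FP instruction with the FPU disabled. */
void fpu_disabled_trap(struct proc *p)
{
    assert(!fpu_enabled);            /* the trap only fires when disabled */
    p->uses_fpu = true;              /* mark the process as an FPU user */
    memset(hw_fpu_regs, 0, sizeof hw_fpu_regs);  /* clean initial state */
    fpu_enabled = true;              /* enable the FPU for this process */
}

/* Switch from prev to next; only FPU users pay for save/restore. */
void task_switch(struct proc *prev, struct proc *next)
{
    if (prev->uses_fpu) {
        memcpy(prev->fpu_regs, hw_fpu_regs, sizeof hw_fpu_regs);
        fpu_saves++;
    }
    if (next->uses_fpu) {
        memcpy(hw_fpu_regs, next->fpu_regs, sizeof hw_fpu_regs);
        fpu_enabled = true;
    } else {
        fpu_enabled = false;         /* its first FP instruction will trap */
    }
}

/* Fraction of CPU time spent on FPU save/restore. */
double fpu_overhead_fraction(double us_per_switch, double switches_per_sec)
{
    return us_per_switch * switches_per_sec / 1e6;
}
```

With the numbers from the estimate, fpu_overhead_fraction(25.0, 100.0)
comes to 0.0025, i.e. 0.25%. A process that never executes a floating
point instruction never has uses_fpu set, so task_switch() copies nothing
for it.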