Xref: utzoo comp.lang.fortran:5104 comp.unix.cray:291 comp.sys.super:310
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!uwm.edu!convex.csd.uwm.edu!bruno
From: bruno@convex.csd.uwm.edu (Bruno Wolff III)
Newsgroups: comp.lang.fortran,comp.unix.cray,comp.sys.super
Subject: Re: Fortran optimization
Message-ID: <10775@uwm.edu>
Date: 4 Apr 91 21:31:02 GMT
References: <1991Apr3.062644.23436@eagle.lerc.nasa.gov>
Sender: news@uwm.edu
Followup-To: comp.lang.fortran
Organization: University of Wisconsin-Milwaukee Computing Services Division
Lines: 15

In article <1991Apr3.062644.23436@eagle.lerc.nasa.gov>
fsjohnv@alliant1.lerc.nasa.gov (Richard Mulac) writes:
%
%  While optimizing a code last year I ran across a situation which on the
%surface appears to contradict some of the basic rules one abides by when
%trying to write efficient Fortran code. I was trying to merge several DO
%loop blocks to eliminate the overhead associated with redundant index
%calculations and also to reduce memory access. The code was written to
%be run on a CRAY Y-MP, compiling with cft77 4.0.1.6. What I found can
%be described using the following program TEST, which calls subroutines
%JOINED and SPLIT (It's a few hundred lines long, but the majority of the
%code is redundant).

Possibly this is caused by the loop not fitting in the instruction cache.
It can be more efficient to have several small loops, each of which fits
entirely in the instruction cache, than one larger loop which doesn't.
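
For anyone who wants to see the two loop shapes side by side, here is a
minimal sketch. It is not the TEST/JOINED/SPLIT code from the quoted
article; the names fused and fissioned, the array size, and the arithmetic
are all made up for illustration. With bodies this small both versions
would fit in any instruction cache, so the effect described above would
only be expected to show up when the merged body is much larger. Which form
is faster has to be measured on the target machine and compiler.

```fortran
! Illustrative sketch only: hypothetical names and placeholder arithmetic,
! not the program from the quoted article.
program fission_demo
  implicit none
  integer, parameter :: n = 1000
  real :: a(n), b(n), c(n), x(n)
  real :: y1(n), z1(n), y2(n), z2(n)
  integer :: i

  ! Inputs chosen so every result is exactly representable in REAL.
  do i = 1, n
     a(i) = real(i)
     b(i) = 2.0 * real(i)
     c(i) = 0.5
     x(i) = 1.0
  end do

  call fused(n, a, b, c, x, y1, z1)
  call fissioned(n, a, b, c, x, y2, z2)

  ! Both forms must give the same answers; only the loop structure differs.
  if (any(y1 /= y2) .or. any(z1 /= z2)) stop 1
  print *, 'fused and fissioned results agree'
end program fission_demo

! One merged loop: a single pass over the index, but a larger loop body.
subroutine fused(n, a, b, c, x, y, z)
  implicit none
  integer :: n, i
  real :: a(n), b(n), c(n), x(n), y(n), z(n)
  do i = 1, n
     y(i) = a(i) * x(i) + b(i)
     z(i) = c(i) * x(i) - a(i)
  end do
end subroutine fused

! Two separate loops: the index work is repeated, but each body is smaller.
subroutine fissioned(n, a, b, c, x, y, z)
  implicit none
  integer :: n, i
  real :: a(n), b(n), c(n), x(n), y(n), z(n)
  do i = 1, n
     y(i) = a(i) * x(i) + b(i)
  end do
  do i = 1, n
     z(i) = c(i) * x(i) - a(i)
  end do
end subroutine fissioned
```

To compare the two, one would grow each loop body to a realistic size and
time the calls separately; the sketch only checks that the results agree.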