Path: utzoo!dptcdc!jarvis.csri.toronto.edu!mailrus!uflorida!novavax!hcx1!hcx2!bill
From: bill@hcx2.SSD.HARRIS.COM
Newsgroups: comp.lang.fortran
Subject: Re: yes vs. no on f8x
Message-ID: <44400036@hcx2>
Date: 17 Apr 89 17:31:00 GMT
References: <24130@beta.lanl.gov>
Lines: 72
Nf-ID: #R:beta.lanl.gov:24130:hcx2:44400036:000:4203
Nf-From: hcx2.SSD.HARRIS.COM!bill Apr 17 13:31:00 1989

> It is precisely this hard information on what the new standard will
> require compilers to do that is missing in this newsgroups discussions.
> Are there any compiler writers out there that want to enlighten us?

I would love to. I have been trying to do so for the past year. Either
I have failed miserably, or you haven't been reading this newsgroup
that long. :-)

Unfortunately, most of my attempts to "enlighten" people about what
FORTRAN/8x will do to compilers have been "misinterpreted", perhaps
partly because I work for (dare I say it?) a VENDOR! And, as everyone
knows, vendors are always against progress and innovation and children
and Mom and apple pie and patriotism and anything else that's good and
decent.

Anyway, I think your example of the array notation is an excellent
place to start. Users of supercomputers and minisupercomputers will
probably see very little difference between using the array notation
and using DO loops. Perhaps their compiles will go faster, as the
compiler has less work to do, but they probably already go so fast as
to appear instantaneous.

However, users of scalar machines, particularly small scalar machines,
will see a HUGE difference. Their compiles will probably go much
slower, or their code will execute much slower, or both.

Now, before we go any further, I have made an assumption: that both
sets of users are dealing with code that has been hand-optimized as
much as possible, that the best algorithms have been used, etc., and
that those algorithms have then been converted from DO loops to the 8x
array notation.
First, let's consider a relatively simple compiler, the kind you would
probably find on a PC or small workstation. That compiler will probably
transform each array statement into one, possibly two, loops. (I say
two because the 8x rules say that, in effect, the entire right-hand
side of the assignment is evaluated prior to altering the left-hand
side. Only in very simple cases can a compiler trivially detect that
this can be avoided without harm; the other cases require more analysis
than our example compiler would typically want to do, given the
horsepower and memory available.)

Now if your original DO loop was only doing that one statement, then
you have lost nothing (yet). But, judging from code I have seen from
customers, that is seldom true -- users typically put multiple,
independent operations in one DO loop to save on the loop overhead. In
that case, you have traded one loop for n loops.

Furthermore, any temporary arrays required to evaluate the right-hand
side of the assignment will probably be allocated dynamically, since
the limits of the arrays involved will seldom be static. That dynamic
allocation costs you, the user, in execution speed, and there will
probably be more temporaries allocated than you might expect, again
because the analysis required to avoid them is just too expensive.

Now let's consider a somewhat smarter compiler, one that attempts to do
some optimization, but still far short of attempting the analysis
required by a vectorizer. This is typically the type of compiler you
would see on a larger scalar machine, like a mini or supermini. This
compiler probably will do better at detecting when the left-hand and
right-hand sides of an assignment don't overlap, and so when it can
avoid allocating a temporary, but it won't be perfect. It might also do
some loop unrolling of those loops generated by the array statements,
which will partially offset the penalty of having traded one loop for
many.
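
To make the two costs above concrete, here is a rough sketch in C of
the kind of loops a simple compiler might generate. This is only an
illustration: the function names are made up, the 8x statements in the
comments are examples I picked, and no real compiler's output is being
quoted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lowering of the 8x statement
 *     A(2:N) = A(1:N-1) + 1.0
 * by a compiler that does no overlap analysis: evaluate the whole
 * right-hand side into a heap temporary, then copy it to the left.
 * Returns 0 on success, -1 if the temporary cannot be allocated. */
static int shift_with_temp(double *a, size_t n)
{
    assert(a != NULL);
    if (n < 2)
        return 0;
    /* Allocated at run time because n is not known at compile time. */
    double *tmp = malloc((n - 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (size_t i = 0; i + 1 < n; i++)      /* loop 1: evaluate RHS */
        tmp[i] = a[i] + 1.0;
    for (size_t i = 0; i + 1 < n; i++)      /* loop 2: store to LHS */
        a[i + 1] = tmp[i];
    free(tmp);
    return 0;
}

/* The same statement as a single forward loop with no temporary.
 * This does NOT match the array-assignment rule: each iteration reads
 * an element that the previous iteration has already overwritten. */
static void shift_naive(double *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        a[i + 1] = a[i] + 1.0;
}

/* What a user typically writes by hand: two independent operations
 * sharing one loop, so the loop overhead is paid once. */
static void fused(double *x, double *y,
                  const double *b, const double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        x[i] = b[i] + c[i];
        y[i] = b[i] * c[i];
    }
}

/* The array-notation form, X = B + C followed by Y = B * C, lowered
 * one statement at a time: same results, but two loops instead of one. */
static void split(double *x, double *y,
                  const double *b, const double *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        x[i] = b[i] + c[i];
    for (size_t i = 0; i < n; i++)
        y[i] = b[i] * c[i];
}
```

With A = (10, 20, 30, 40), shift_with_temp leaves (10, 11, 21, 31),
which is what the evaluate-the-right-side-first rule calls for, while
shift_naive leaves (10, 11, 12, 13). A compiler that cannot prove the
two sides don't overlap has to fall back to the first form. The fused
and split routines compute identical results; the only difference is
how many loops (and how much loop overhead) you pay for.
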
However, you'll still probably pay an execution penalty, because that
one original DO loop might easily fit in your machine's instruction
cache, but all those unrolled loops quite probably won't. You'll pay
for this in more memory traffic and slower execution speed. But you'll
also pay for the increased analysis required of the compiler in slower
compilation speed.

There are many other examples of how the array notation will adversely
impact users of scalar machines. I hope this has been, at least,
educational to you.

Bill Leonard
Harris Computer Systems Division
2101 W. Cypress Creek Road
Fort Lauderdale, FL 33309
bill@ssd.harris.com or hcx1!bill@uunet.uu.net