Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!think!harvard!seismo!umcp-cs!chris
From: chris@umcp-cs.UUCP (Chris Torek)
Newsgroups: net.unix-wizards
Subject: Re: Any decent Fortrans under Unix ? Which machine ?
Message-ID: <255@umcp-cs.UUCP>
Date: Thu, 13-Mar-86 17:29:34 EST
Article-I.D.: umcp-cs.255
Posted: Thu Mar 13 17:29:34 1986
Date-Received: Sat, 15-Mar-86 19:57:33 EST
References: <173@cybavax.UUCP> <206@goanna.OZ> <208@goanna.OZ>
Distribution: net
Organization: U of Maryland, Computer Science Dept., College Park, MD
Lines: 112

[I have removed net.lang.f77 from the newsgroups list since this is
no longer directly related to Fortran.]

In article <208@goanna.OZ> rcodi@goanna.UUCP writes:

>It would be nice if {ccom, f77pass1, pc0} all skipped the phases of
>generating assembly language text, and just produced a load-file
>straight from the source.

In terms of speed, at any rate, it would be nice.  But I am one of
those strange people who on occasion likes to inspect the output of
the compiler, both before and after peephole optimisation.

>Poor pc users are stuck with the following [sequence]: cpp, pc0,
>pc1 (f1), pc2, c2, pc3, as, ld.

The Pascal compiler does not use the C preprocessor.  The rest of
the chain is accurate.

>Even nicer if cpp was built into the input parser of ... the
>compilers (but still have it as a separate package too).

And therein lies a problem.  This kind of thing can get to be a
maintenance nightmare.  (It *can* be done, and it is quite arguably
worthwhile for something as heavily used as a compiler on a
development system.  But *I* do not want to do it.)

>I tend to disagree with the philosophy of just using registers R0 and
>R1 for calculations, and assigning the rest for register variables.

*That* sounds like VMS.  The Vax Unix compilers use r0-r5 as scratch
(for the perhaps peculiar reason that certain Vax instructions
clobber these registers).

>ALL intermediate results in calculations should be cached in
>registers.

If you want a good optimising compiler, any expressions that are
reused should be so cached.  But saving all intermediate results is
not always optimal.

>I have looked at a lot of assembly code produced by various ports
>of UNIX C compilers and noted that there is little or no attempt
>to "carry over" intermediate results from C statement to C statement.
>The same goes for f77 and pc.

In C, this is often not a problem.  In the other languages, it is;
this is why the 4.3 f77 has a huge front end optimiser.

>What does YOUR compiler generate for the following C code?
>        a = z[i,j,k];
>        b = z[i,j,k];

Given the following declarations:

        int z[N];
        f() {
                register int i, j, k, a, b;
                ...

it gives

        movl    _z[r9],r8
        movl    _z[r9],r7

(The commas in `z[i,j,k]' are comma operators, so only k, here in
r9, indexes z.)  Now, while the second statement might be a bit
faster if it used `movl r8,r7', it probably makes no noticeable
difference.  But I suspect you meant

        a = z[i][j][k];
        b = z[i][j][k];

which of course requires much more work.  But as an `experienced' C
programmer, I would write

        a = b = z[i][j][k];

which (given `int z[2][3][4]') generates

        mull3   $48,r11,r0
        addl2   $_z,r0
        ashl    $4,r10,r1
        addl2   r1,r0
        ashl    $2,r9,r1
        addl2   r1,r0
        movl    (r0),r7
        movl    r7,r8

which is not after all too awful.  (Better might be

        # not sure if this 3 ins seq is faster than mull3
        ashl    $4,r11,r0       # r0 = x16
        ashl    $1,r0,r1        # r1 = x32
        addl2   r1,r0           # r0 = x48
        # this here is all the same as before
        addl2   $_z,r0
        ashl    $4,r10,r1
        addl2   r1,r0
        # this next one is the real `win', use fancy addr
        # modes to get *(r0 + 4*r9)
        movl    (r0)[r9],r7
        movl    r7,r8

A true Vax assembly hacker may well know of even better tricks.)

Anyway, the point of all this is that C often does not require an
optimising compiler.  Not that I have anything against them; indeed,
I like optimising compilers, but I can make do without them.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:   seismo!umcp-cs!chris
CSNet:  chris@umcp-cs           ARPA:   chris@mimsy.umd.edu
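
[Postscript: for anyone who wants to check the indexing arithmetic
in the z[i][j][k] example, here is a minimal C sketch.  The names
comma_index, byte_offset and actual_offset are made up for
illustration and come from no compiler source.  The scale factors
48, 16 and 4 in the assembly assume 4-byte ints, as on the Vax; the
sketch uses sizeof(int) instead so that it holds elsewhere.]

```c
#include <stddef.h>

enum { NI = 2, NJ = 3, NK = 4 };    /* dimensions of `int z[2][3][4]' */

static int z[NI][NJ][NK];

/*
 * v[i,j,k] uses the comma operator: i and j are evaluated and
 * discarded, and only k is used as the subscript.  The (void) casts
 * merely silence "unused value" warnings; they do not change the result.
 */
static int *comma_index(int *v, int i, int j, int k)
{
    return &v[(void)i, (void)j, k];     /* same address as &v[k] */
}

/*
 * Byte offset of z[i][j][k] from the start of z, computed by hand.
 * With 4-byte ints this is 48*i + 16*j + 4*k, matching the
 * mull3 $48 / ashl $4 / ashl $2 sequence.
 */
static size_t byte_offset(size_t i, size_t j, size_t k)
{
    return ((i * NJ + j) * NK + k) * sizeof(int);
}

/* The same offset, as the compiler itself computes it. */
static size_t actual_offset(size_t i, size_t j, size_t k)
{
    return (size_t)((char *)&z[i][j][k] - (char *)z);
}
```

With 4-byte ints, byte_offset(1, 2, 3) is 48 + 32 + 12 = 92, and it
agrees with actual_offset(1, 2, 3) on any conforming compiler.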