Xref: utzoo comp.arch:10558 comp.lang.misc:3057
Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!ames!xanth!mcnc!decvax!ima!haddock!suitti
From: suitti@haddock.ima.isc.com (Stephen Uitti)
Newsgroups: comp.arch,comp.lang.misc
Subject: Re: Programming and Machine Operations
Message-ID: <13979@haddock.ima.isc.com>
Date: 9 Jul 89 04:35:25 GMT
References: <57125@linus.UUCP> <1989Jun24.230056.27774@utzoo.uucp> <13970@haddock.ima.isc.com> <1398@l.cc.purdue.edu>
Reply-To: suitti@haddock.ima.isc.com (Stephen Uitti)
Organization: Interactive Systems, Boston
Lines: 75

In various articles, cik & suitti argue...
>>suitti
>cik

>> The code sequence is: ...
>>
>> void lvecadd(a, b, c, s) /* a = b + c, length s */
>> long *a; long *b; long *c; long s;
>> {
>>     do {
>>         *a++ = *b++ + *c++;
>>     } while (--s != 0);
>> }

For the VAX:
L18:addl3 (r9)+,(r10)+,r0
    movl r0,(r11)+
    decl r8
    jneq L18

>
>I am afraid I will have to give you a D- on this.  Most of the time I would
>not even bother with a call, considering the code length.  But the code is
>bad for vectors of length >2.  How about
>{
>    end = a + s;
>    do {
>        *a++ = *b++ + *c++;
>    } while (a < end);
>}

L26:addl3 (r9)+,(r10)+,r0
    movl r0,(r11)+
    cmpl r11,r7
    jlss L26

These are the loops as coded by the VAX PCC based C compiler.  I've
no idea why it felt the movl was needed.  In any case both code
fragments compile to the same number of instructions.  My decl has
fewer arguments than your cmpl.  Without running the test, there is
no way to know which will run faster.  It depends on which VAX.  A
microVAX might (and probably does) have different relative times
than a 780.

Still, i expected the compiler to use some sort of subtract one and
branch if not zero instruction instead of a decrement, and a branch.
Unfortunately, i didn't check my VAX architecture handbook in
advance, and i'd forgotten (through disuse) that the VAX has an
SOBGTR rather than an SOBNE.
I should have used
    } while (s > 0);
On a PDP-11, the "SOB" instruction uses a test for not equal to
zero.  Further, on a PDP-11 (and many other machines) an extra
variable may strain the register resources of the machine requiring
some high use variable be put into slower storage.  A really good
compiler should have been able to convert the end condition and use
the right instruction.  The VAX PCC based compiler is not a "really
good compiler".

Also, i was considering inlined code, even if originally presented
as a function.  If the vector lengths are large (and experience
shows that they never are), it doesn't matter if it is a subroutine
or not, just as the computation for 'end' in your example is
immaterial.

Yeah, i know, facts in an argument.

Fortunately, i went to a school that didn't have letter grades.  It
was pass/fail.  Many schools, including Purdue, often treat their
grad students with some dignity and respect.  I had the good fortune
to find an undergrad school that did so.  I never had to argue with
instructors.  It was always real clear if i'd passed or failed (did
i do the work?  yes: pass, no: fail).  Thus, i never got a D- for
code that worked.  Once, however, in high school, i managed to get
an A (best you could get), with a U (unsatisfactory) for effort.

Stephen.
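
For reference, the three loop endings discussed in this thread can be put side by side. What follows is only a sketch under stated assumptions, not the code either poster compiled: it uses prototype-style definitions instead of the K&R style quoted above, reads the "s > 0" ending as `--s > 0`, and assumes a length of at least 1 (a do-while always runs its body once). The names lvecadd_ne, lvecadd_end, lvecadd_gt, MAXN and loops_agree are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count down and stop when the counter reaches zero (the original form). */
void lvecadd_ne(long *a, const long *b, const long *c, long s)
{
    assert(s > 0);              /* assumption: length is at least 1 */
    do {
        *a++ = *b++ + *c++;
    } while (--s != 0);
}

/* Compare the destination pointer against a precomputed end (cik's form). */
void lvecadd_end(long *a, const long *b, const long *c, long s)
{
    long *end;

    assert(s > 0);
    end = a + s;
    do {
        *a++ = *b++ + *c++;
    } while (a < end);
}

/* Count down with a "greater than zero" test, the condition SOBGTR checks. */
void lvecadd_gt(long *a, const long *b, const long *c, long s)
{
    assert(s > 0);
    do {
        *a++ = *b++ + *c++;
    } while (--s > 0);
}

#define MAXN 64                 /* hypothetical upper bound for the check */

/* Hypothetical helper: returns 1 if all three loops produce the same,
   correct sums for a vector of length n, where 1 <= n <= MAXN. */
int loops_agree(long n)
{
    long b[MAXN], c[MAXN], r1[MAXN], r2[MAXN], r3[MAXN];
    long i;

    assert(n > 0 && n <= MAXN);
    for (i = 0; i < n; i++) {
        b[i] = i;
        c[i] = 10 * i;
    }
    lvecadd_ne(r1, b, c, n);
    lvecadd_end(r2, b, c, n);
    lvecadd_gt(r3, b, c, n);

    for (i = 0; i < n; i++)
        if (r1[i] != b[i] + c[i])
            return 0;
    return memcmp(r1, r2, (size_t)n * sizeof(long)) == 0
        && memcmp(r1, r3, (size_t)n * sizeof(long)) == 0;
}
```

For any length of 1 or more the three versions store the same sums, so the choice between them only affects which instructions a compiler can pick for the loop ending. It says nothing about which is faster on a given machine; that still takes a timing run on the machine in question.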