Path: utzoo!yunexus!geac!syntron!jtsv16!uunet!lll-winken!lll-tis!ames!mailrus!cwjcc!tut.cis.ohio-state.edu!bloom-beacon!athena.mit.edu!scs From: scs@athena.mit.edu (Steve Summit) Newsgroups: comp.lang.c Subject: Re: Efficient coding considered harmful? Message-ID: <7700@bloom-beacon.MIT.EDU> Date: 28 Oct 88 10:28:49 GMT Article-I.D.: bloom-be.7700 References: <3105@hubcap.UUCP> <34112@XAIT.XEROX.COM> <1700@dataio.Data-IO.COM> <8630@smoke.ARPA> <1704@dataio.Data-IO.COM> <119@twwells.uu Sender: daemon@bloom-beacon.MIT.EDU Reply-To: scs@adam.pika.mit.edu (Steve Summit) Lines: 428 In article <119@twwells.uucp> bill@twwells.UUCP (T. William Wells) writes: >It's the damnedest thing, but people whose opinions I otherwise >respect seem to have this thing about coding efficiently. They don't >like it. Worse, they discourage it. Some time ago I came to the conclusion that the style vs. efficiency debate is another religious war, which I usually try to stay out of. Good style and, I'll admit it, anti-efficiency is a personal crusade, though, so I can't help but respond. None of this should be taken as a personal attack against Mr. Wells, or anyone else, for that matter. His points are well taken; I simply have a different point of view, going back to first principles, as all good religious wars do. When caught between a rock and a hard place, when something's got to give, some people shrug their shoulders and sacrifice style, modularity, or portability. I am prepared to sacrifice efficiency. (Most of the time I can have my cake and eat it too, and I have any number of general techniques for doing so, which I will probably defer to a separate posting so that those who are sensibly ignoring these diatribes can still see them.) >I advocate training oneself to be one who codes efficient programs. I >do not mean writing code and then making it more efficient, I mean >writing it that way in the first place. Here I agree with at least some of the words, but not the implications. It happens that I usually write nice, efficient programs, but not by resorting to microefficiency tweaks or by sacrificing readability in any way. It is important to employ lots of common sense (and, of course, "common sense isn't"); to shy away from grotesquely inefficient algorithms; and to partition the code so that key sections can be cleanly replaced later if efficiency does prove to be an issue. It isn't important to write every last expression and statement in the most theoretically blistering way possible. >I have heard three main counterarguments to this: >1) "The compiler ought to handle that, so you shouldn't bother with > it." What nonsense! We have to code in the real world, not the > world of make-believe and ought-to. Many of the more egregious examples of source-level microefficiency tweaking are, in fact, routinely handled by today's (and even yesterday's) compilers. Consider replacing power-of-two multiplies by left shifts, a textbook example. I just booted up my pdp11 (I am not making this up), and its optimizer knows how to shift left when I multiply by two. (To be sure, not all newer compilers have been better.) >3) "Efficient coding makes obscurer programs." Well, since most > efficient coding involves minor changes from the inefficient way > of doing things, changes which are mostly a matter of style rather > than of organization... I don't know what is meant by "most efficient coding." The interpretation "the most efficient coding involves the most minor changes" is probably not the one that was intended, although I like it, because taken to its extreme it says not to make any changes at all. I cannot agree with the more likely interpretation. The minor changes, the replacements of multiplies and divides by shifts and the like, are mostly noise (both with respect to efficiency and style). The really big efficiency hacks, the ones people spend weeks and months on, do involve massive organizational changes and wholesale concessions to readability and maintainability. > ...this argument should really be read: "*I* > don't like to or find it hard to read that kind of program, so it > is unclear." The attitude of most C experts is "*I* don't have any problem reading this code; anyone else who considers himself a Real C Programmer shouldn't, either." This attitude is patronizing and wrong. To borrow a previous argument, "We have to code in the real world, not the world of make-believe." There are plenty of people out there who are still learning C, and since C has become as popular as it has and C programmers are in demand, many of them are working for real companies writing (and maintaining) real programs. A few painless concessions (like multiplying instead of shifting, or making comparisons against NULL pointers explicit) really do substantially increase the readability of code by mere mortals. (I maintain that every unnecessary translation from an encoding back to an intended meaning, no matter how intuitive to the experienced programmer, incrementally increases comprehension time, effort required, and probability of error.) >...look for redundant calculations: sometimes it is >cost effective to store the result of a previous calculation instead >of recomputing it. ...but be scrupulously careful, and comment the stored result well, because doing so is one, extremely common, "concession to maintainability." One of the all-time, number one bugs is the redundant variable whose value becomes wrong because one of the precedent variables changes, but the calculation is not repeated. If you use n squared at several different places within a longish section of nontrivial control flow, and you try to replace it with a new variable nsq which is set to n*n just once, sooner or later someone (probably you) is going to make some change to the program (either a new assignment to n, or a new wrinkle in the control flow) which results in nsq not containing the square of the current n. If you leave the explicit n*n everywhere, this error is impossible. This advice, like almost anything, must be applied on a case-by-case basis. When the control flow is straight-through and all occurrences of the common subexpression are well- localized, a temporary variable is often a good idea, not so much because it's easier to type or it might be more efficient, but because it makes it easier for the reader to prove that the identical quantity was really used in both (or all three, etc.) places. >Use register variables. And take a litle pain to get them right. A >technique that works enough of the time is this:... I'm sorry, I don't have spare time or spare brain cells for following or utilizing a technique like this. When I was a beginning C programmer, I decided not to use register variables at all in the first pass of coding, because I figured that, when I eventually got the program working, somebody was going to say that it wasn't fast enough, and if I had already blown all of my ammunition, how was I going to speed it up? Lately I throw in "register" all the time, but in some ways I respect my original attitude more. The point is, if it's the sort of program that people notice how fast it is, they would have complained even if I had used register variables since day one. I have also always felt that this is one area where the compiler really should be doing the work. Many modern compilers are very, very good at register allocation. It is true that the historically popular compilers (specifically, the Unix ones) are not, so it is good practice to put "important stuff" in registers, but don't spend a lot of time on it unless you have to. (What's "important stuff?" A ridiculously simplistic rule of thumb for using "register" in C, albeit one of the two I use, is "on any variable named 'i' or 'p'.") >Avoid excess function calls. A lot of people have been brainwashed by >some modularity gurus into believing that every identifiable "whole" >should have its own function. What I have found is that this makes >the program LESS maintainable than otherwise. Here's another religious debate: the structured programming one. Nevertheless, good modularity is the only way to keep a large program maintainable. Of particular pertinence to the theme of this article is that efficiency-critical modules, if properly isolated, can be implemented quickly and (possibly) inefficiently first, while getting the program to work correctly at all, and then transparently replaced later with a more efficient version, if and only if it is found that the quick, inefficient implementation is really too inefficient after all. If people have a bad impression of highly modular code, it is because they (or, more likely, their predecessors) have used what I call a "Ginsu knife" approach to modular decomposition, whereas the correct instrument to use is a scalpel. If you went to a lecture once on modularity but the only guideline you remember is "50 source lines," you're going to use split -50 on your monolithic source files, and not garner any of the benefits of good modularity. If, on the other hand, you learn how to modularize well, you just won't believe how easy your life (with respect to software development and maintenance, anyway) will become. There are a few applications (and, some believe, a few architectures) for which function call overhead is significant, but they are infrequent, and in general there should be absolutely no stigma attached to a function call. It's usually easy to back a function call out later, if you have to. (The normal way is with macros, but be careful lest the semantic differences between macros and functions hamper code quality.) >Rather than hand-optimizing >as you code (though one should certainly do a certain amount of >this as a matter of course), I am advocating developing a mindset >where this kind of "optimizing" becomes your natural coding style. >As an example of what I mean, when I am coding a program... >I don't write my > > for (i = 0; i < ALAST; ++i) > ... A[i]; > >and then change it to: > > for (ap = A; ap < &A[ALAST]; ++ap) > ... *ap; > >I write it this second way, as a matter of course. I am in complete agreement that there is a large class of good coding practices which is far better learned well and adopted from the beginning than applied retroactively. (I disagree that source-level microefficiency belongs in this class.) The judicious use of pointers does belong in this class, once you're comfortable with them, and as long as they're appropriate for the problem at hand. If you're not coding for a particular machine, or if it's low-frequency code, don't worry about using array notation if it makes more sense. Among other things, it may not make a speed difference, after all. Just yesterday a colleague and I were working with an FFT routine. (Interestingly enough, it was in the book "Numerical Recipes in C".) The first difficulty was the use of shifts instead of multiplies and divides. My colleague is a fine mathematician, and a competent C programmer, but "m=n>>1" just wasn't leaping off the page and saying "divide by two" for him, and he was interested in seeing how the algorithm worked. I was particularly annoyed with this particular shift, because it was in initialization code, so any savings wouldn't matter. Then I realized that there were also some shifts in inner loops, and was about to concede that the shifts in the initialization section were justifiable, for consistency's sake. Then I noticed that the "m=n>>1", which did appear in a loop, operated on a constant n which had been computed with "n=nn<<1", so if efficiency really mattered, a simple "m=nn" could have been used. (Side note on style: a variable called "nn" usually means twice "n," not the other way around.) Then I noticed an "istep=2*mmax" (also in a loop), thus demolishing any consistency argument. Shift vs. multiply is actually a side issue here. Things got interesting when I noticed that, in spite of their apparent concern with efficiency, the authors of the FFT code had used nothing but array references, even in the inner loops of the code (doubtlessly because it had been translated from Fortran). "Well," I said to myself, remembering a favorite sentence from The Elements of Programming Style, "FFT's are an area in which efficiency does matter in practice, so let's see what happens if I replace those array references with pointers." The result, to my considerable surprise, but reinforcing the point all the more, was that there was no difference (30.59 vs. 30.54 seconds for 20 512-point FFT's on a PC AT.) I know that there are machines on which a difference would have been evident. The point here is that there are also machines for which there is no difference at all, either because they can implement subscripts efficiently, or because the floating-point operations completely dominate the timing. Previous posters have noted that there are also machines for which pointers could actually be slower. >: o Replace [simple if/then with ?:] >I have to disagree with this. >statements. Unfortunately, this particular change, when done with >complicated expressions, can make the code much harder to read... Agreed. >: o Organize control flow such that the most common cases go straight >: through, this is most important for pipelined machines. >While this is true, this tends to make the code *much* less readable... Absolutely. Control flow should be determined almost entirely by readability and stylistic considerations. >: o Avoid bit fields, most especially signed ones. >Try: don't use them at all, they tend to be nonportable. This is a side question: what's so unportable about bitfields? Sure, the layout (left to right or right to left) isn't defined, but that's only a problem if you're trying to conform to external layouts, which is a "problem" with structures in general. (The solution is neither "don't use bitfields" nor "don't use structures" but "don't try to portably conform to external layouts.") The ordering could also be a problem if the code internally depended on it in some way, but this is no more or less a problem than programs which depend on byte ordering within words. Are there substantial numbers of real (not toy) C compilers that either don't implement bitfields, or that don't implement them correctly? ("Correctly" as defined by K&R and ANSI, not by some program that is trying to use them nonportably.) >: o Use realloc instead of malloc/free pairs. Use calloc instead of >: malloc followed by zeroing each member. >Efficient memory allocation is *much* more difficult than this, what >you really need to do is to cut down on the number of calls to malloc >and free. Of course, that usually means writing some kind of >allocation stuff yourself, never an easy task. Please don't avoid malloc; the alternative is generally fixed-size arrays and "lines longer than 512 characters are silently truncated." Please don't roll your own allocation routines, again unless you have to; this is the kind of low-level dirty work, hard enough to do well, that you should let the lower levels (i.e. the standard implementations of malloc) give it their best shot. >Also, calloc is nearly useless for clearing structures that contain >pointers. It is just not portable to assume that a bit pattern of all >zeros is equivalent to a null pointer. Absolutely correct. Why does calloc exist? Of what use is it? Why has ANSI legitimized it? In principle, it is equally useless for clearing structures that contain floating-point fileds. >: > o Avoid things like, >: > a %= 4; >: > Replace with, >: > a &= 3; > >: In the case that the "4" might change, parameterizing the >: first case will give correct code after it changes, while >: the second will break unless another power of two is used >: for the replacement for "4". Thus the first is SAFER. > >Unless, of course, the programmer remembered to COMMENT. If the code reads a &= 3; /* really a %= 4 */ or a &= 3; /* really a %= HASHSIZE */ and I do a global replace of 4, or re#define HASHSIZE, the comment may not help. >: Micro-efficiency gets regularly denounced. Here are some counterarguments: >: o If your commercial product is slower than the competition, you >: won't get a chance to take advantage of the maintainability/ >: portability advantages because you'll be out of business. >Or if it is larger. This discussion has focused on making programs >faster, but making them smaller is also relevant. I agree, and find code size far more frequently relevant in practice (among other things, because of the aforementioned pdp11). Remember that the classic tradeoff is between time and space, so the fancy efficiency hacks often make the code larger. On the other hand, there is a quote of Brian Kernighan's or Dennis Ritchie's (which I can't find at the moment) which points out that, counter to the tradeoff, clean, tight code is often smaller and faster. (Emphasis on clean. Of course, I advocate sneaky hacks to decrease code size as much as I advocate efficiency hacks, i.e. not at all.) I might also point out that the marketplace generally rewards the product that comes out first, so if you can compress your development schedule by using clean, straightforward design and eschewing time-consuming and bug-prone efficiency tweaking, you may come out ahead (and sew up your market share and your pocketbook by releasing a speeded up version 2 six months later). >: o The sum of all the tweaks can make the difference (on the PC) >: between using a small memory model and the large. This gives >: a lot of leverage to seemingly minor changes. >The sum of all tweaks can make a difference on lots of machines. >Consider the Mac 32K segments, the PDP-11 64K+64K address space, the >user forced to wait for 10 seconds instead of 4 because of the 5% >slower routine which caused the scheduler to lower his priority.... While these considerations are real, they should be viewed as unfortunate and temporary limitations which will be resolved, hopefully soon enough, at the root level, and not something which we resign ourselves to working around forever, and which we get so used to working around that the workarounds become part of the language, persisting well beyond their need. These are what Fred Brooks calls "accidental problems," and every programmer minute and line of code spend catering to them is not being spent on real problems. >: o I derive a lot of personal satisfaction from creating 'lean >: and mean' code. >This is often disregarded: good programmers get that way because they >*care*, and if they care, they will insist on doing the best they >can. And that should include these "tweaks". I'll third the motion, but without the tweaks. My proudest code is that which is not only small and fast and elegant and portable and modular and extensible but also patently obvious in function to the casual observer, and demonstrably correct without exhaustive analysis. That may sound like a tall order, but I think I'm fairly successful at it, and as I've implied I'll compromise on the first two (size and speed) before the rest, and the last two (obvious and demonstrably correct) are non-negotiable. * * * I'm not going to try to second-guess all of the replies I will undoubtedly get from the efficiency aficionadoes out there, but I will mention two things. I keep saying "don't do unless you have to," by which I mean after actual experiments on prototype code demonstrate that there is a problem. The attitude of "don't worry about efficiency until later" is frequently denigrated as leading to unsalvageably inefficient code, because trying to patch in some efficiency later is tantamount to a rewrite. Although this can be true, the solution is to teach people good, responsible programming practices early, avoiding gratuitous and unnecessary inefficiency, without teaching them to "optimize too early." It's been said before, and I'll say it again and keep saying it: Don't Optimize Too Early. If, once the program is working correctly, it's too slow, profile it and then see about optimizing the routines that will make a difference. This absolutely does not mean that you should write throwaway code; if the prototype is a hodgepodge, you won't be able to go back in and speed it up cleanly, and you also won't be able to go back and add any features without breaking it (on the off chance that it isn't badly broken already). Responsible programming practices are just what the articles I am reacting to are trying to formulate, and all I'm trying to do is to draw the line a little more finely between what's reasonable and what's overboard. My second point, and the reason I'm taking all of this time and space replying, is that people who are learning programming (and we're all still learning programming) are much more impressionable than you might think. There's nothing wrong with this; it's actually refreshing given that it is often supposed that people stop learning at around age 20. But it does mean that if you unilaterally tell a reasonably green programmer to use pointers instead of arrays, he'll spend months having lots of frustrating problems with pointers, and likely end up hating them and the language in general. Even the most obfuscated efficiency tweaks do occasionally have a place, when speed really matters and the hardware (or whatever) isn't up to it, but these techniques tend to be discussed and advocated out of proportion to their usefulness. None of us need more excuses or encouragement to write tricky code. There are far better problems to be expending effort on. Steve Summit scs@adam.pika.mit.edu