Path: utzoo!yunexus!geac!syntron!jtsv16!uunet!lll-winken!lll-tis!ames!mailrus!cwjcc!tut.cis.ohio-state.edu!bloom-beacon!athena.mit.edu!scs
From: scs@athena.mit.edu (Steve Summit)
Newsgroups: comp.lang.c
Subject: Re: Efficient coding considered harmful?
Message-ID: <7700@bloom-beacon.MIT.EDU>
Date: 28 Oct 88 10:28:49 GMT
Article-I.D.: bloom-be.7700
References: <3105@hubcap.UUCP> <34112@XAIT.XEROX.COM> <1700@dataio.Data-IO.COM> <8630@smoke.ARPA> <1704@dataio.Data-IO.COM> <119@twwells.uu
Sender: daemon@bloom-beacon.MIT.EDU
Reply-To: scs@adam.pika.mit.edu (Steve Summit)
Lines: 428

In article <119@twwells.uucp> bill@twwells.UUCP (T. William Wells) writes:
>It's the damnedest thing, but people whose opinions I otherwise
>respect seem to have this thing about coding efficiently.  They don't
>like it. Worse, they discourage it.

Some time ago I came to the conclusion that the style vs.
efficiency debate is another religious war, which I usually try
to stay out of.  Good style and, I'll admit it, anti-efficiency
is a personal crusade, though, so I can't help but respond.

None of this should be taken as a personal attack against Mr.
Wells, or anyone else, for that matter.  His points are well
taken; I simply have a different point of view, going back to
first principles, as all good religious wars do.  When caught
between a rock and a hard place, when something's got to give,
some people shrug their shoulders and sacrifice style,
modularity, or portability.  I am prepared to sacrifice
efficiency.  (Most of the time I can have my cake and eat it too,
and I have any number of general techniques for doing so, which I
will probably defer to a separate posting so that those who are
sensibly ignoring these diatribes can still see them.)

>I advocate training oneself to be one who codes efficient programs. I
>do not mean writing code and then making it more efficient, I mean
>writing it that way in the first place.

Here I agree with at least some of the words, but not the
implications.  It happens that I usually write nice, efficient
programs, but not by resorting to microefficiency tweaks or by
sacrificing readability in any way.  It is important to employ
lots of common sense (and, of course, "common sense isn't"); to
shy away from grotesquely inefficient algorithms; and to partition
the code so that key sections can be cleanly replaced later if
efficiency does prove to be an issue.  It isn't important to
write every last expression and statement in the most
theoretically blistering way possible.

>I have heard three main counterarguments to this:
>1) "The compiler ought to handle that, so you shouldn't bother with
>   it." What nonsense! We have to code in the real world, not the
>   world of make-believe and ought-to.

Many of the more egregious examples of source-level
microefficiency tweaking are, in fact, routinely handled by
today's (and even yesterday's) compilers.  Consider replacing
power-of-two multiplies by left shifts, a textbook example.
I just booted up my pdp11 (I am not making this up), and its
optimizer knows how to shift left when I multiply by two.  (To be
sure, not all newer compilers have been better.)

>3) "Efficient coding makes obscurer programs." Well, since most
>   efficient coding involves minor changes from the inefficient way
>   of doing things, changes which are mostly a matter of style rather
>   than of organization...

I don't know what is meant by "most efficient coding."  The
interpretation "the most efficient coding involves the most minor
changes" is probably not the one that was intended, although I
like it, because taken to its extreme it says not to make any
changes at all.  I cannot agree with the more likely
interpretation.  The minor changes, the replacements of
multiplies and divides by shifts and the like, are mostly noise
(both with respect to efficiency and style).  The really big
efficiency hacks, the ones people spend weeks and months on, do
involve massive organizational changes and wholesale concessions
to readability and maintainability.

>   ...this argument should really be read: "*I*
>   don't like to or find it hard to read that kind of program, so it
>   is unclear."

The attitude of most C experts is "*I* don't have any problem
reading this code; anyone else who considers himself a Real C
Programmer shouldn't, either."  This attitude is patronizing and
wrong.  To borrow a previous argument, "We have to code in the
real world, not the world of make-believe."  There are plenty of
people out there who are still learning C, and since C has become
as popular as it has and C programmers are in demand, many of
them are working for real companies writing (and maintaining)
real programs.  A few painless concessions (like multiplying
instead of shifting, or making comparisons against NULL pointers
explicit) really do substantially increase the readability of
code by mere mortals.  (I maintain that every unnecessary
translation from an encoding back to an intended meaning, no
matter how intuitive to the experienced programmer, incrementally
increases comprehension time, effort required, and probability of
error.)

>...look for redundant calculations: sometimes it is
>cost effective to store the result of a previous calculation instead
>of recomputing it.

...but be scrupulously careful, and comment the stored result
well, because doing so is one, extremely common, "concession to
maintainability."  One of the all-time, number one bugs is the
redundant variable whose value becomes wrong because one of the
precedent variables changes, but the calculation is not repeated.
If you use n squared at several different places within a longish
section of nontrivial control flow, and you try to replace it
with a new variable nsq which is set to n*n just once, sooner or
later someone (probably you) is going to make some change to the
program (either a new assignment to n, or a new wrinkle in the
control flow) which results in nsq not containing the square of
the current n.  If you leave the explicit n*n everywhere, this
error is impossible.

This advice, like almost anything, must be applied on a
case-by-case basis.  When the control flow is straight-through
and all occurrences of the common subexpression are well-
localized, a temporary variable is often a good idea, not so much
because it's easier to type or it might be more efficient, but
because it makes it easier for the reader to prove that the
identical quantity was really used in both (or all three, etc.)
places.

>Use register variables. And take a litle pain to get them right.  A
>technique that works enough of the time is this:...

I'm sorry, I don't have spare time or spare brain cells for
following or utilizing a technique like this.

When I was a beginning C programmer, I decided not to use
register variables at all in the first pass of coding, because I
figured that, when I eventually got the program working, somebody
was going to say that it wasn't fast enough, and if I had already
blown all of my ammunition, how was I going to speed it up?
Lately I throw in "register" all the time, but in some ways I
respect my original attitude more.  The point is, if it's the
sort of program that people notice how fast it is, they would
have complained even if I had used register variables since day
one.

I have also always felt that this is one area where the compiler
really should be doing the work.  Many modern compilers are very,
very good at register allocation.  It is true that the
historically popular compilers (specifically, the Unix ones) are
not, so it is good practice to put "important stuff" in
registers, but don't spend a lot of time on it unless you have
to.  (What's "important stuff?"  A ridiculously simplistic rule
of thumb for using "register" in C, albeit one of the two I use,
is "on any variable named 'i' or 'p'.")

>Avoid excess function calls. A lot of people have been brainwashed by
>some modularity gurus into believing that every identifiable "whole"
>should have its own function. What I have found is that this makes
>the program LESS maintainable than otherwise.

Here's another religious debate: the structured programming one.
Nevertheless, good modularity is the only way to keep a large
program maintainable.  Of particular pertinence to the theme of
this article is that efficiency-critical modules, if properly
isolated, can be implemented quickly and (possibly) inefficiently
first, while getting the program to work correctly at all, and
then transparently replaced later with a more efficient version,
if and only if it is found that the quick, inefficient
implementation is really too inefficient after all.

If people have a bad impression of highly modular code, it is
because they (or, more likely, their predecessors) have used what
I call a "Ginsu knife" approach to modular decomposition, whereas
the correct instrument to use is a scalpel.  If you went to a
lecture once on modularity but the only guideline you remember is
"50 source lines," you're going to use split -50 on your
monolithic source files, and not garner any of the benefits of
good modularity.  If, on the other hand, you learn how to
modularize well, you just won't believe how easy your life (with
respect to software development and maintenance, anyway) will
become.

There are a few applications (and, some believe, a few
architectures) for which function call overhead is significant,
but they are infrequent, and in general there should be
absolutely no stigma attached to a function call.  It's usually
easy to back a function call out later, if you have to.  (The
normal way is with macros, but be careful lest the semantic
differences between macros and functions hamper code quality.)

>Rather than hand-optimizing
>as you code (though one should certainly do a certain amount of
>this as a matter of course), I am advocating developing a mindset
>where this kind of "optimizing" becomes your natural coding style.
>As an example of what I mean, when I am coding a program...
>I don't write my
>
>	for (i = 0; i < ALAST; ++i)
>		... A[i];
>
>and then change it to:
>
>	for (ap = A; ap < &A[ALAST]; ++ap)
>		... *ap;
>
>I write it this second way, as a matter of course.

I am in complete agreement that there is a large class of good
coding practices which is far better learned well and adopted
from the beginning than applied retroactively.  (I disagree that
source-level microefficiency belongs in this class.)  The
judicious use of pointers does belong in this class, once you're
comfortable with them, and as long as they're appropriate for the
problem at hand.  If you're not coding for a particular machine,
or if it's low-frequency code, don't worry about using array
notation if it makes more sense.  Among other things, it may not
make a speed difference, after all.

Just yesterday a colleague and I were working with an FFT
routine.  (Interestingly enough, it was in the book "Numerical
Recipes in C".)  The first difficulty was the use of shifts
instead of multiplies and divides.  My colleague is a fine
mathematician, and a competent C programmer, but "m=n>>1" just
wasn't leaping off the page and saying "divide by two" for him,
and he was interested in seeing how the algorithm worked.  I was
particularly annoyed with this particular shift, because it was
in initialization code, so any savings wouldn't matter.  Then I
realized that there were also some shifts in inner loops, and was
about to concede that the shifts in the initialization section
were justifiable, for consistency's sake.  Then I noticed that
the "m=n>>1", which did appear in a loop, operated on a constant
n which had been computed with "n=nn<<1", so if efficiency really
mattered, a simple "m=nn" could have been used.  (Side note on
style: a variable called "nn" usually means twice "n," not the
other way around.)  Then I noticed an "istep=2*mmax" (also in a
loop), thus demolishing any consistency argument.

Shift vs. multiply is actually a side issue here.  Things got
interesting when I noticed that, in spite of their apparent
concern with efficiency, the authors of the FFT code had used
nothing but array references, even in the inner loops of the code
(doubtlessly because it had been translated from Fortran).
"Well," I said to myself, remembering a favorite sentence from
The Elements of Programming Style, "FFT's are an area in which
efficiency does matter in practice, so let's see what happens if
I replace those array references with pointers."  The result, to
my considerable surprise, but reinforcing the point all the more,
was that there was no difference (30.59 vs. 30.54 seconds for 20
512-point FFT's on a PC AT.)

I know that there are machines on which a difference would have
been evident.  The point here is that there are also machines for
which there is no difference at all, either because they can
implement subscripts efficiently, or because the floating-point
operations completely dominate the timing.  Previous posters have
noted that there are also machines for which pointers could
actually be slower.

>:     o   Replace [simple if/then with ?:]
>I have to disagree with this.
>statements. Unfortunately, this particular change, when done with
>complicated expressions, can make the code much harder to read...

Agreed.

>:     o   Organize control flow such that the most common cases go straight
>:         through, this is most important for pipelined machines.
>While this is true, this tends to make the code *much* less readable...

Absolutely.  Control flow should be determined almost entirely by
readability and stylistic considerations.

>:     o   Avoid bit fields, most especially signed ones.
>Try: don't use them at all, they tend to be nonportable.

This is a side question: what's so unportable about bitfields?
Sure, the layout (left to right or right to left) isn't defined,
but that's only a problem if you're trying to conform to external
layouts, which is a "problem" with structures in general.  (The
solution is neither "don't use bitfields" nor "don't use
structures" but "don't try to portably conform to external
layouts.")  The ordering could also be a problem if the code
internally depended on it in some way, but this is no more or
less a problem than programs which depend on byte ordering
within words.

Are there substantial numbers of real (not toy) C compilers that
either don't implement bitfields, or that don't implement them
correctly?  ("Correctly" as defined by K&R and ANSI, not by some
program that is trying to use them nonportably.)

>:     o   Use realloc instead of malloc/free pairs. Use calloc instead of
>:         malloc followed by zeroing each member.
>Efficient memory allocation is *much* more difficult than this, what
>you really need to do is to cut down on the number of calls to malloc
>and free. Of course, that usually means writing some kind of
>allocation stuff yourself, never an easy task.

Please don't avoid malloc; the alternative is generally
fixed-size arrays and "lines longer than 512 characters are
silently truncated."  Please don't roll your own allocation
routines, again unless you have to; this is the kind of low-level
dirty work, hard enough to do well, that you should let the lower
levels (i.e. the standard implementations of malloc) give it
their best shot.

>Also, calloc is nearly useless for clearing structures that contain
>pointers.  It is just not portable to assume that a bit pattern of all
>zeros is equivalent to a null pointer.

Absolutely correct.  Why does calloc exist?  Of what use is it?
Why has ANSI legitimized it?  In principle, it is equally useless
for clearing structures that contain floating-point fileds.

>: >    o  Avoid things like,
>: >               a %= 4;
>: >       Replace with,
>: >               a &= 3;
>
>: In the case that the "4" might change, parameterizing the
>: first case will give correct code after it changes, while
>: the second will break unless another power of two is used
>: for the replacement for "4".  Thus the first is SAFER.
>
>Unless, of course, the programmer remembered to COMMENT.

If the code reads

	a &= 3;		/* really a %= 4 */
or
	a &= 3;		/* really a %= HASHSIZE */

and I do a global replace of 4, or re#define HASHSIZE, the
comment may not help.

>: Micro-efficiency gets regularly denounced. Here are some counterarguments:
>: o       If your commercial product is slower than the competition, you
>:         won't get a chance to take advantage of the maintainability/
>:         portability advantages because you'll be out of business.
>Or if it is larger. This discussion has focused on making programs
>faster, but making them smaller is also relevant.

I agree, and find code size far more frequently relevant in
practice (among other things, because of the aforementioned
pdp11).  Remember that the classic tradeoff is between time and
space, so the fancy efficiency hacks often make the code larger.
On the other hand, there is a quote of Brian Kernighan's or
Dennis Ritchie's (which I can't find at the moment) which points
out that, counter to the tradeoff, clean, tight code is often
smaller and faster.  (Emphasis on clean.  Of course, I advocate
sneaky hacks to decrease code size as much as I advocate
efficiency hacks, i.e. not at all.)

I might also point out that the marketplace generally rewards the
product that comes out first, so if you can compress your
development schedule by using clean, straightforward design and
eschewing time-consuming and bug-prone efficiency tweaking,
you may come out ahead (and sew up your market share and your
pocketbook by releasing a speeded up version 2 six months later).

>: o       The sum of all the tweaks can make the difference (on the PC)
>:         between using a small memory model and the large. This gives
>:         a lot of leverage to seemingly minor changes.
>The sum of all tweaks can make a difference on lots of machines.
>Consider the Mac 32K segments, the PDP-11 64K+64K address space, the
>user forced to wait for 10 seconds instead of 4 because of the 5%
>slower routine which caused the scheduler to lower his priority....

While these considerations are real, they should be viewed as
unfortunate and temporary limitations which will be resolved,
hopefully soon enough, at the root level, and not something which
we resign ourselves to working around forever, and which we get
so used to working around that the workarounds become part of the
language, persisting well beyond their need.  These are what Fred
Brooks calls "accidental problems," and every programmer minute
and line of code spend catering to them is not being spent on
real problems.

>: o       I derive a lot of personal satisfaction from creating 'lean
>:         and mean' code.
>This is often disregarded: good programmers get that way because they
>*care*, and if they care, they will insist on doing the best they
>can. And that should include these "tweaks".

I'll third the motion, but without the tweaks.  My proudest code
is that which is not only small and fast and elegant and portable
and modular and extensible but also patently obvious in function
to the casual observer, and demonstrably correct without
exhaustive analysis.  That may sound like a tall order, but I
think I'm fairly successful at it, and as I've implied I'll
compromise on the first two (size and speed) before the rest, and
the last two (obvious and demonstrably correct) are non-negotiable.

			*	*	*

I'm not going to try to second-guess all of the replies I will
undoubtedly get from the efficiency aficionadoes out there, but
I will mention two things.

I keep saying "don't do <something> unless you have to," by which
I mean after actual experiments on prototype code demonstrate
that there is a problem.  The attitude of "don't worry about
efficiency until later" is frequently denigrated as leading to
unsalvageably inefficient code, because trying to patch in some
efficiency later is tantamount to a rewrite.  Although this can
be true, the solution is to teach people good, responsible
programming practices early, avoiding gratuitous and unnecessary
inefficiency, without teaching them to "optimize too early."

It's been said before, and I'll say it again and keep saying it:
Don't Optimize Too Early.  If, once the program is working
correctly, it's too slow, profile it and then see about
optimizing the routines that will make a difference.  This
absolutely does not mean that you should write throwaway code; if
the prototype is a hodgepodge, you won't be able to go back in
and speed it up cleanly, and you also won't be able to go back
and add any features without breaking it (on the off chance that
it isn't badly broken already).

Responsible programming practices are just what the articles I am
reacting to are trying to formulate, and all I'm trying to do is
to draw the line a little more finely between what's reasonable
and what's overboard.  My second point, and the reason I'm taking
all of this time and space replying, is that people who are
learning programming (and we're all still learning programming)
are much more impressionable than you might think.  There's
nothing wrong with this; it's actually refreshing given that it
is often supposed that people stop learning at around age 20.
But it does mean that if you unilaterally tell a reasonably green
programmer to use pointers instead of arrays, he'll spend months
having lots of frustrating problems with pointers, and likely end
up hating them and the language in general.

Even the most obfuscated efficiency tweaks do occasionally have a
place, when speed really matters and the hardware (or whatever)
isn't up to it, but these techniques tend to be discussed and
advocated out of proportion to their usefulness.  None of us need
more excuses or encouragement to write tricky code.  There are
far better problems to be expending effort on.

                                            Steve Summit
                                            scs@adam.pika.mit.edu