Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/5/84; site terak.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!hao!noao!terak!doug
From: doug@terak.UUCP (Doug Pardee)
Newsgroups: net.works
Subject: Not again!?  Assembler vs High-Level languages
Message-ID: <483@terak.UUCP>
Date: Thu, 4-Apr-85 11:51:16 EST
Article-I.D.: terak.483
Posted: Thu Apr  4 11:51:16 1985
Date-Received: Sun, 7-Apr-85 09:00:31 EST
Organization: Terak Corporation, Scottsdale, AZ, USA
Lines: 53

I continue to receive mail on the "assembly versus high-level language"
issue.  Many of my correspondents claim that a good optimizing compiler
can produce code as good as an assembler programmer can.

I obviously disagree.  If we are talking about "mediocre" programmers,
okay, I'll accept that a mediocre C programmer with an optimizing
compiler can produce code as good as a mediocre assembler programmer's.
But if we're talking about sharp programmers (like me :-), there is no
way that I can get a C compiler to produce code as good as what I can
write in assembler.

I decided to run a (not too badly rigged) experiment.  I took a Pascal
program and rewrote it into both C and assembler for the NS32016 chip.
The program I chose as the guinea pig is the ol' classroom favorite,
the Ackermann function.  This is a heavily recursive function, with little
actual computation.  I chose it because it was short and I knew that it
would give me the best results (from my point of view) in the test. (I
did admit to a bit of rigging...)
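
For readers who haven't seen it, here is a minimal sketch of the
function in C.  This is the usual two-argument textbook definition, not
the code used in the test; the name "ackermann" and the unsigned long
types are my own assumptions for illustration.

```c
/* Two-argument Ackermann function, textbook form (illustrative only).
 * Nearly all of the work is calling and branching; the only
 * arithmetic is an occasional add or subtract of 1. */
unsigned long ackermann(unsigned long m, unsigned long n)
{
    if (m == 0)
        return n + 1;                   /* base case */
    if (n == 0)
        return ackermann(m - 1, 1);     /* one recursive call */
    /* nested case: the inner call must finish before the outer starts */
    return ackermann(m - 1, ackermann(m, n - 1));
}
```

Even small arguments produce a very large number of calls and a deep
stack, which is why the function stresses call overhead so hard.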

It took me about five times as long to write the assembler version as
the C version (no surprise).  My goal was to include as many refinements
as possible in the assembler version that were flat-out impossible to
code in C.  There were three major areas of attack.

First, the 32000 series *hates* to branch.  So I wrote the main
recursive loop to be totally straight-line code.  I also put all branch
labels on word boundaries; that cuts down the branching penalty.
Second, the instructions were carefully chosen based on the published
32016 timing charts.  And most importantly, the "call" overhead was
decimated by passing parameters in registers instead of on the stack,
stacking only the one variable that would be needed later, and by using
the 32000's simple call instruction instead of the "do-all" instruction
that C has to use.
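
To show the shape of the "stack only the one variable" idea, here is a
rough sketch in C.  It is not the assembler code from the test, and the
name "ackermann_loop" is hypothetical.  It only mirrors the control
flow: the two tail calls become a loop, so m is the only value that has
to survive the one remaining recursive call.  Register parameter
passing and the choice of call instruction cannot be expressed in C
source at all; those stay up to the compiler.

```c
/* Ackermann with the tail calls folded into a loop (illustrative only).
 * Across the single remaining recursive call, only m is still needed. */
unsigned long ackermann_loop(unsigned long m, unsigned long n)
{
    while (m != 0) {
        if (n == 0)
            n = 1;                          /* A(m,0) = A(m-1,1) */
        else
            n = ackermann_loop(m, n - 1);   /* inner call; m is live */
        m--;                                /* continue as A(m-1, n) */
    }
    return n + 1;                           /* A(0,n) = n + 1 */
}
```

The result is the same as the textbook definition; for example, both
forms give 9 for arguments (2, 3).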

And the results??

The optimized C program did not run faster than the assembler program.
It did not run as fast as the assembler.  It didn't take only 10%, or
only 25% longer.  It couldn't finish given even *twice* as much time.

In fact, it took more than five times as long as the assembler version.
The unoptimized C version took six times as long as the assembler
version.  And both C versions required six times as much memory for stack
space as the assembler program did.

I don't claim that the factor of 5 or 6 is at all representative.  I
did choose Ackermann because I knew that calling and branching are the
biggest chinks in the armor, and that's about *all* that Ackermann does.
But the point is still valid:  there is no way that an optimizing
compiler can equal a sharp assembler programmer when code efficiency is
crucial.
-- 
Doug Pardee -- Terak Corp. -- !{hao,ihnp4,decvax}!noao!terak!doug