Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!zaphod.mps.ohio-state.edu!sdd.hp.com!news.cs.indiana.edu!noose.ecn.purdue.edu!mentor.cc.purdue.edu!pop.stat.purdue.edu!hrubin
From: hrubin@pop.stat.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: Compilers and efficiency
Summary: Recognizing the uses of unusual instructions
Message-ID: <11411@mentor.cc.purdue.edu>
Date: 27 Apr 91 12:05:16 GMT
References: <27fa3350.6bc2@petunia.CalPoly.EDU> <9782@mentor.cc.purdue.edu> <4082@batman.moravian.EDU>
Sender: news@mentor.cc.purdue.edu
Lines: 100

In article <4082@batman.moravian.EDU>, halkoD@batman.moravian.EDU (David Halko) writes:
> In article <9782@mentor.cc.purdue.edu>, hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> > In article <1991Apr5.172533.6717@agate.berkeley.edu>, doug@eris.berkeley.edu (Doug Merritt) writes:
> > > In article <1991Apr4.125122.1@capd.jhuapl.edu> waltrip@capd.jhuapl.edu writes:
> > > >	He observes that "Few compilers use any of the nifty instructions that
> > > >	a CISC has."  I believe I read recently that RISC was based on the
> > > >	observation that, in fact, only about 30% of the instructions in CISC
> > > >	computers were used by compilers.  The rest of the instructions, for
> > > >	all practical purposes, were just excess baggage.
> > 			.........................
> > > Anyway, the point is that even if compilers *do* use every possible feature
> > > of a CISC, it still usually won't make the CISC s/w competitive with RISC
> > > s/w. CISC cpus almost never put sufficient h/w optimization into the support
> > > of the fancy instructions. (There are exceptions to this, of course, and I
> > > haven't been watching the 030 and 040 to see how they do in this regard.
> > > CISC may yet rise again, but not until after superscalar RISC has been
> > > completely exploited.)
> > 			.........................
> > If the languages do not allow the user to communicate  the use of those
> > features of the CISC architecture which can make the object code considerably
> > faster, the compiler cannot take advantage of them.  This IS the situation.
> > For example, suppose there was an instruction, which could be effectively
> > used, which would divide a float by a float, obtaining an integer quotient
> > and a float remainder.  Now just how is the compiler to recognize that this
> > is in fact wanted?  
> > 			.........................
> What do you mean how is a compiler suppoosed to recognize that this is wanted?

I mean exactly that.  I am assuming that the operations are done by means of
a sequence of instructions, and not a call to some (possibly inlined) function.
I know of NO hardware on which I would consider calls to be reasonable where
short sequences of code will do the job.

It is not the case that the algorithm to be used is the same on different
machines.  Hardware differences can cause major problems, as was seen when
cache/paging first came in; an algorithm for matrix multiplication which
would seem to be better from the timing standpoint (same operations, fewer
loads, far fewer stores) caused page faults on the great bulk of its loads,
and thus became non-competitive for large matrices.

Consider the following two operations, which comprise more than 25% of the 
operations in some highly efficient, from the "picocode" standpoint, methods
for generating non-uniform random numbers.

1.  Examine a bit, and take some action depending on it.  Also, the next time
this is to be done, this bit effectively no longer exists.  It is necessary
to make provision at least somewhere for replenishing the supply of bits.

2.  Find the distance to the next one in a bit stream.  The same considerations
as above apply.

Try coding this in any present language, including assembler, in such a way
that a compiler can be expected to recognize one of these without being 
explicitly instructed.

I am not saying that this should not be the case.  But it is the user, not
the language designer, who has to do this.  Whatever language, present or
future, is used, human beings are going to come up with extremely simple
operations which the language designers have not thought of, and for which
doing those operations using the language primitives will be clumsy.

The user must be able to instruct the compiler about these operations
in a manner similar to the way it now looks at addition and multiplication.
If there are competing ways to do something, the compiler must know the
available variety.  This is even the case for adding vectors in C.

>     AT&T came out with a DSP board which utilized neato CPU features in its
> high level languages... since floating point multiply was so much faster
> than integer multiply, it would take integers, convert them to float, do
> the floating point multiplication, and convert back to integer again. Man,
> talk about fast and nifty. This was all handled internally, inside the C
> Compiler. It seems quite useful for integer division to do the same, here.
> There were, however, some minor flaws in the Compiler which took away from
> some of it's efficiency. I used to use global search and replace with specific
> patterns to optimize the code I produced...

If this was the case, the hardware was deficient.  To a mathematician, floating
point arithmetic is a complicated version of fixed point arithmetic.  This is a
point which the RISC designers do not see; the chip real estate for doing good
integer arithmetic is already there in the floating point stuff, and the design
should be such as to use it.  Floating point arithmetic uses fixed point 
internally, so this would not be a problem AT THE TIME OF THE DESIGN OF THE
ARCHITECTURE.  It is a problem afterward.

> > There is, at present, no language designed to take into account the collection
> > of "natural" useful hardware which the present users can find uses for, and
> > which are far more expensive in software than in hardware.
> 
> I think it may be more of a question of "Are CISC features being fully realized by
> compiler designers?"

Have any language designers or hardware designers ever asked for input on the
variety of operations which are readily understood by people like Silverman,
Lehmer, etc.?  Have language designers considered that vastly different 
algorithms should be used with different architectures?  Most optimization
procedures which I have seen assume not.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)