Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site wdl1.UUCP
Path: utzoo!linus!decvax!tektronix!hplabs!hpda!fortune!wdl1!jbn
From: jbn@wdl1.UUCP
Newsgroups: net.micro.68k
Subject: Re: FLAME!!! Re: EA orthogonality
Message-ID: <435@wdl1.UUCP>
Date: Wed, 22-May-85 15:24:18 EDT
Article-I.D.: wdl1.435
Posted: Wed May 22 15:24:18 1985
Date-Received: Sun, 26-May-85 21:19:59 EDT
Sender: notes@wdl1.UUCP
Organization: Ford Aerospace, Western Development Laboratories
Lines: 33
Nf-ID: #R:terak:-55700:wdl1:22700013:000:1968
Nf-From: wdl1!jbn May 22 11:38:00 1985

The idea is to make programs go fast. This requires a machine for which a compiler can generate fast code. This is quite different from a machine for which it is easy to generate code.

One of the easiest architectures for which to generate code is the true stack machine, where all operands are pushed on the stack and all operators take data from the stack and return it to the stack. The code for such machines is reverse Polish notation, such as HP calculators use. UCSD Pascal P-code is the best known modern ``machine'' that works this way, but many hardware machines, starting with the English Electric Leo Marconi KDF9 in 1959, and many Burroughs machines from 1960 on, worked this way. The compilers are trivial.

But you can't optimize effectively for a true stack machine. Nor can the machine overlap or pipeline operations effectively. Because all operations implicitly refer to the top of the stack, the independence of operations needed for pipelining is very difficult if not impossible to achieve. Pipelined machines typically have many registers; the instruction fetch/decode unit can then keep grabbing instructions and shipping them off to the functional units for execution until blocked by a reference to a register tied up by an operation in progress.
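
To make the ``compilers are trivial'' point concrete, here is a minimal sketch in Python of code generation for a pure stack machine. The tuple encoding of expressions, the PUSH/ADD/SUB/MUL mnemonics, and the names stack_code and run are all invented for illustration; this is not P-code or any real instruction set. The whole compiler is a post-order walk of the expression tree.

```python
# Expression trees are nested tuples (op, left, right); a leaf is a variable name.
# All mnemonics and names here are hypothetical, chosen only for illustration.

def stack_code(node):
    """Post-order walk: emit reverse Polish code for a pure stack machine."""
    if isinstance(node, str):
        return [("PUSH", node)]
    op, left, right = node
    return stack_code(left) + stack_code(right) + [(op,)]

OPS = {
    "ADD": lambda x, y: x + y,
    "SUB": lambda x, y: x - y,
    "MUL": lambda x, y: x * y,
}

def run(code, env):
    """Interpret the generated code; every operator works on the top of the stack."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(env[instr[1]])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[instr[0]](left, right))
    return stack.pop()

# (a + b) * (c - d)
expr = ("MUL", ("ADD", "a", "b"), ("SUB", "c", "d"))
code = stack_code(expr)
result = run(code, {"a": 1, "b": 2, "c": 7, "d": 3})
```

Note that the ADD and the SUB in this example do not depend on each other's results, yet in the generated code both go through the same top-of-stack location, which is the kind of implicit coupling the paragraph above describes.
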
The CDC6600 and IBM 7030 (STRETCH) circa 1965 were the first machines that worked this way, and the newer microprocessors are starting to use this technology. The Stanford MIPS machine does work this way, but lacks the hardware interlocks (called the ``scoreboard'' in the CDC6600) to cause instruction fetch/decode to block when a register conflict is detected; the compiler for the MIPS machine has to stick in no-op instructions if look-ahead would cause a register conflict.

What the CPU designer concerned with speed really needs is a good background in optimizing compiler technology and some knowledge of the history of CPU architecture.

					John Nagle
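
The no-op padding described above for a machine without hardware interlocks can be sketched roughly as follows, in Python. The machine model is an assumption made for illustration only: one instruction issues per cycle, each instruction has a fixed latency before its destination register may be read, and the instruction names, latencies, and the function insert_nops are hypothetical. It does not describe the actual Stanford MIPS pipeline or its compiler.

```python
def insert_nops(program):
    """Pad a straight-line program so no instruction reads a register too early.

    program: list of (text, dest, sources, latency) tuples, where latency is
    the assumed number of cycles before dest may be read (hypothetical model).
    """
    ready_at = {}  # register -> first cycle at which its value may be read
    out = []       # one slot per cycle, so len(out) is the next issue cycle
    for text, dest, sources, latency in program:
        needed = max((ready_at.get(r, 0) for r in sources), default=0)
        # With no interlocks, the compiler itself must fill the waiting cycles.
        while len(out) < needed:
            out.append("NOP")
        ready_at[dest] = len(out) + latency
        out.append(text)
    return out

program = [
    ("LOAD r1", "r1", [], 2),      # assumed: loaded value usable 2 cycles later
    ("ADD r2", "r2", ["r1"], 1),
    ("ADD r3", "r3", ["r2"], 1),
]
padded = insert_nops(program)
```

An optimizing compiler would try to move an independent instruction into the slot rather than emit a no-op; this sketch shows only the padding step.
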