Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site nmtvax.UUCP
Path: utzoo!linus!philabs!cmcl2!lanl!unm-la!unmvax!nmtvax!blaine
From: blaine@nmtvax.UUCP
Newsgroups: net.arch
Subject: Re: Stack architectures - why not?
Message-ID: <806@nmtvax.UUCP>
Date: Tue, 8-Oct-85 14:55:28 EDT
Article-I.D.: nmtvax.806
Posted: Tue Oct  8 14:55:28 1985
Date-Received: Fri, 11-Oct-85 08:15:33 EDT
References: <796@kuling.UUCP> <172@myriasa.UUCP> <1094@ulysses.UUCP> <>
Reply-To: blaine@nmtvax.UUCP (Blaine Gaither)
Followup-To: Article 1260 (Chris Grey)
Organization: New Mexico Tech, Socorro
Lines: 45
Summary: 

In article <1260> cg.myriasa.UUCP (Chris Grey) writes:
> I've been told by a couple of people who are normally well informed that
> a pure stack architecture just isn't practical. They have NOT been able
> to convince me of this. Anybody out there want to try?

I will not answer this question directly, but rather offer some
information from someone who has worked with both types of systems.
Burroughs has had a stack-oriented product line since around
1964.  These machines carry the Burroughs designations B5000-B7900 and
A-Series.  Their performance spans the range from a VAX-750
to the largest 308x IBMs.  Contrary to some statements, it has not been very
difficult to develop high-end machines.  Low-end machines are no longer a
problem (thanks to small Winchesters which can hold the MCP).  The Burroughs
machines have a segmented memory with tagged data.

Smaller machines execute the RPN directly, while larger machines can
run into two problems:
   1.  Having to execute too many instructions/second (it takes more
       operators to do something in pure RPN).
   2.  Having too high a volume of memory reads and writes from the top
       of stack.
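
To make the two costs concrete, here is a minimal sketch in Python of a
naive stack machine that keeps its whole expression stack in memory.  It is
an illustration only, not a model of any Burroughs machine; the operator
names, the run_rpn function and the access-counting rules are all
assumptions made up for this example.

```python
def run_rpn(program, memory):
    """Run a tiny RPN program with the expression stack held in memory.

    Returns (operators_executed, stack_memory_accesses).
    Hypothetical cost model: each stack push or pop is one memory access.
    """
    stack = []
    operators = 0
    accesses = 0
    for op, arg in program:
        operators += 1
        if op == "push":                 # fetch a variable, write it to the stack
            stack.append(memory[arg])
            accesses += 1
        elif op in ("add", "mul"):       # two stack reads, one stack write
            y = stack.pop()
            x = stack.pop()
            stack.append(x + y if op == "add" else x * y)
            accesses += 3
        elif op == "store":              # one stack read
            memory[arg] = stack.pop()
            accesses += 1
        else:
            raise ValueError(f"unknown operator: {op}")
    return operators, accesses


# a = b + c * d, written in RPN
program = [("push", "b"), ("push", "c"), ("push", "d"),
           ("mul", None), ("add", None), ("store", "a")]
memory = {"b": 2, "c": 3, "d": 4}
operators, accesses = run_rpn(program, memory)
print(operators, accesses, memory["a"])
```

Under these assumptions the statement costs six operators and ten stack
accesses, where a three-operand machine could express it as a multiply
followed by an add.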

These problems are not difficult to get around.  First, operator concatenation
is used on high-end machines to fold RPN onto a register machine dynamically!
This is rather like a small peephole optimizer in hardware.  RPN
is very useful in that it does not tie you down to a particular number of
registers (anything from 0 to N is fine).  In theory a tightly coupled
configuration of a small RPN-executing machine (such as one for handling
datacom) and a larger concatenating machine could share the same object image,
one mapping onto an x,y top-of-stack register pair while the other goes to GPRs.
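
The folding idea can be sketched in software.  The Python below is only an
illustration of the general technique, not a description of how the
Burroughs hardware does it: pushes are deferred on a symbolic stack, and an
instruction is emitted only when an arithmetic operator or a store consumes
its operands.  The function fold_rpn, the register names r0, r1, ... and the
unlimited supply of registers are assumptions made for the example.

```python
def fold_rpn(program):
    """Fold RPN operators into three-operand register instructions."""
    pending = []      # symbolic stack: names of variables or registers
    emitted = []      # the register-machine instructions produced
    next_reg = 0      # assumption: registers never run out
    for op, arg in program:
        if op == "push":
            pending.append(arg)          # deferred: nothing is emitted yet
        elif op in ("add", "mul"):
            y = pending.pop()
            x = pending.pop()
            dest = f"r{next_reg}"
            next_reg += 1
            emitted.append((op, dest, x, y))
            pending.append(dest)         # the result is now the top of stack
        elif op == "store":
            emitted.append(("move", arg, pending.pop()))
        else:
            raise ValueError(f"unknown operator: {op}")
    return emitted


# a = b + c * d, written in RPN
program = [("push", "b"), ("push", "c"), ("push", "d"),
           ("mul", None), ("add", None), ("store", "a")]
folded = fold_rpn(program)
for instruction in folded:
    print(instruction)
```

Six RPN operators fold into three register instructions here.  A machine
with only an x,y register pair could run the same object code directly,
which is the point about sharing one object image.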

The concatenated opcodes are as easy to optimize as those of any GPR machine.

Top-of-stack memory accesses can be reduced by a special cache.
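
As a rough illustration of why such a cache helps, the Python sketch below
counts expression-stack memory accesses when the top few stack entries are
held in registers.  The model is an assumption for this example (spill the
oldest cached entry on overflow, read from memory only when the cache is
empty), not the actual Burroughs design, and stack_traffic is a made-up
name.  Only stack traffic is counted, not the fetches of the variables
themselves.

```python
def stack_traffic(program, cached_entries):
    """Count stack memory accesses with the top `cached_entries` in registers."""
    in_regs = 0       # stack entries currently held in registers
    in_mem = 0        # stack entries currently spilled to memory
    accesses = 0

    def push():
        nonlocal in_regs, in_mem, accesses
        if in_regs == cached_entries:    # cache full: oldest entry is written out
            in_mem += 1
            accesses += 1
        else:
            in_regs += 1

    def pop():
        nonlocal in_regs, in_mem, accesses
        if in_regs > 0:
            in_regs -= 1
        elif in_mem > 0:                 # cache empty: read the entry from memory
            in_mem -= 1
            accesses += 1
        else:
            raise IndexError("stack underflow")

    for op in program:
        if op == "push":
            push()
        elif op in ("add", "mul"):       # binary operator: two pops, one push
            pop()
            pop()
            push()
        elif op == "store":
            pop()
        else:
            raise ValueError(f"unknown operator: {op}")
    return accesses


# a = b + c * d, written in RPN (operand names omitted)
program = ["push", "push", "push", "mul", "add", "store"]
for size in (0, 2, 4):
    print(size, stack_traffic(program, size))
```

With no cache this statement makes ten stack accesses; with two cached
entries it makes two, and with four it makes none.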

The essential thing to remember is that a stack instruction set does not
need to imply a stack processor design.

P.S.  Tags are not a performance problem.  They are a great help for
debugging.  Thunk heaven(:->)

Blaine Gaither
-- 
Blaine Gaither                    ucbvax!unmvax!nmtvax!blaine
Computer Science Department       blaine@nmt