Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.2 9/18/84; site rochester.UUCP Path: utzoo!watmath!clyde!cbosgd!ihnp4!qantel!dual!lll-crg!gymble!umcp-cs!seismo!rochester!mayer From: mayer@rochester.UUCP (Jim Mayer) Newsgroups: net.arch Subject: Re: What I miss in micro-processors (fairly long) Message-ID: <11462@rochester.UUCP> Date: Sun, 8-Sep-85 11:59:22 EDT Article-I.D.: rocheste.11462 Posted: Sun Sep 8 11:59:22 1985 Date-Received: Wed, 11-Sep-85 06:09:09 EDT References: <796@kuling.UUCP> <1713@orca.UUCP> Reply-To: mayer@rochester.UUCP (Jim Mayer) Organization: U. of Rochester, CS Dept. Lines: 86 Keywords: compilers, optimization, procedures Summary: Fast calling sequences, inline expansion. In article <1713@orca.UUCP> jans@orca.UUCP (Jan Steinman) writes: >In article <796@kuling.UUCP> grzm@kuling.UUCP (Gunnar Blomberg/ADB) writes: >>... there are a couple of features I've been missing rather badly... >>... >>2. A quick subroutine call. >> >> What I would really like to see here is something like the PDP-10 JSP >>instruction, something that jumps to a subroutine and puts the return >>address in a register instead of on the stack. > >I hope you never try anything recursive this way! Current popular recursive- >capable languages don't have any way to guarantee a given routine will never >be used recursively, so compilers for such languages (Pascal, C, Ada, etc.) >would never use such an instruction. As for non-recursive assembly code, I >would look carefully at re-designing your entire calling scheme. If time- >optimal, macrofy and in-line the calls (with the added advantage of being able >to use arbitrary registers). If stack space-optimal, write a macro that >sticks the PC in a register, then jumps. > >Guess what I'm really trying to say is that I don't see enough use for either >of these things to justify the effort needed to put them in. >.... I would also like to see a simple, register based, calling sequence on microprocessors. Here are my reasons: 1. Simple compilers for most languages usually start off procedures by saving a bunch of registers, and finish by restoring the same set. The trend for CISC machines (vax, 68000, etc) is to provide some variant on the "push multiple registers"/"save according to register mask" theme. Given this kind of prelude, it doesn't really matter whether the PC is pushed on the stack at the JSR, or at the MOVEM. Recursion still works fine. And since simple compilers usually don't do to well at allocating registers across procedure calls, the cost of the tied up register is not to great either. This assumes, of course, a register intensive architecture. 2. It is quite easy to detect a large class of routines in ANY LANGUAGE that are not called recursively. In the most general case, just build a call graph and look for cycles. Given a more traditional compile&link compiler architecture it would still pay to look for routines that don't call any other functions (or any functions outside of the current module/source file). 3. Finally, to throw a new twist into the time optimal/space optimal tradeoff, consider the effect of compiler generated specialized calling sequences. Given some run time statistics a compiler can identify calls that would benefit from inline or specialized calling sequences. The same techniques can be used to propagate information about register usage. In fact, one can view inline expansion as nothing more than one extreme of calling sequence optimization. The optimizations are driven by execution time profiling information and estimates of the potential for further traditional optimizations constrained by a space bound. For more information see Gene Ball's Ph.D. dissertation: John Eugene Ball Program Improvement by the Selective Integration of Procedure Calls Ph.D. Thesis University of Rochester, 1982. Also, for a simpler approach that only handles inline expansion and does not consider the effects of further optimizations see: Robert W. Scheifler An Analysis of Inline Substitution for a Structured Programming Language CACM, Volume 20, Number 9 September 1977, pages 647-654. 4. I guess that where I differ from the original poster is that I don't want a stack based calling sequence at all! Stack based calling sequences on machines with lots of registers seem to me like a perfect example of a premature optimization. My general feeling is that this kind of "low level" high level language support (eg. stack based calls, index instructions, bounds checking) is usually not worth adding since an optimizing compiler can do a better job of speeding things up anyway. On the other hand, specialized high level microcode/hardware support for things like checksums, context switching, message passing, and FFT's can be big wins. I'm especially fond of writable control stores. Boy is this long. -- Jim Mayer University of Rochester (arpa) mayer@Rochester.ARPA Department of Computer Science (uucp) rochester!mayer Ray P. Hylan Building (via allegra, decvax, or seismo) Rochester, New York 14627