Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!think.com!spool.mu.edu!uunet!world!iecc!compilers-sender
From: byron@archone.tamu.edu (Byron Rakitzis)
Newsgroups: comp.compilers
Subject: Re: Code folding from JSR/RTS -> {Local}
Keywords: optimize, architecture
Message-ID: <15709@helios.TAMU.EDU>
Date: 2 May 91 16:35:56 GMT
Article-I.D.: helios.15709
References: <9104262025.AA21840@enuxva.eas.asu.edu> <7524@ecs.soton.ac.uk> <1991May1.035622.25021@daffy.cs.wisc.edu>
Sender: compilers-sender@iecc.cambridge.ma.us
Reply-To: byron@archone.tamu.edu (Byron Rakitzis)
Organization: College of Architecture, Texas A&M University.
Lines: 48
Approved: compilers@iecc.cambridge.ma.us

In article <1991May1.035622.25021@daffy.cs.wisc.edu> carter@cs.wisc.edu (Gregory Carter) writes:
>I don't know if this is incredibly obvious or what, but in the OLDEN DAYS
>when I was working on a 6502 machine. Speed was of the upmost...ad nauseum..

Hmm... are we seeing an "all the world's a 6502" attitude in this article?
(Although I admit that my subsequent comments assume "all the world's a
machine with slow memory and fast registers", e.g., most RISC architectures.)

>What I want to know, if any of you have considered, for high speed
>applications to get that extra push, what the problems would be in compiler
>design to:

>1) Transcribe all local variables to global ones.

Ouch. This will kill register allocation on most compilers. You WANT your
code in blocks with local declarations of variables; this gives an
optimizing compiler a better chance at allocating registers over variable
lifetimes.

Moreover, even the simplest optimizations can be killed by making your
variables global; compilers like gcc usually make pessimal assumptions
about how global variables are used:

	x = 1;		/* assume x is global. gcc writes x to memory */
	foo();
	bar(x);		/* gcc will reload x from memory, even if foo() does */
			/* not touch x or the register that x was stored in */
			/* (i.e., there's no interprocedural optimization */
			/* on globals) */

>2) Replacing subroutine calls with the actual code.

This is already done; it is usually called inlining.

>3) minimizing stack frame usage to almost ZIPPO.

If you have a good compiler on a RISC architecture, then you probably touch
the stack frame as little as possible ANYWAY. Even return addresses are not
stored on the stack in leaf subroutines on the MIPS, for example (or rather,
they shouldn't be!).

Sorry to pick nits. Now, to address the ideas in your article: yes, it's
clear that it should be possible for you to configure your compiler to trade
space for time. Most of the optimizations you suggest are more readily
accomplished by good interprocedural optimization, though.
--
Byron Rakitzis, Texas A&M., byron@archone.tamu.edu
--
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
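
To make the point about globals concrete, here is a minimal sketch in C. All of the names (counter, touch_nothing, sum_global, sum_local, self_check) are made up for illustration and do not come from the articles above. Whether a particular compiler really keeps the local in a register, or really reloads the global, depends on the compiler and its flags; the sketch only shows the source-level pattern, and both routines compute the same result.

```c
#include <assert.h>

/* Hypothetical global; any function called later might modify it. */
int counter = 0;

/* Stands in for foo(). Imagine it lives in another source file, so the
   compiler cannot see that it never touches counter. */
void touch_nothing(void) { }

/* Works on the global directly. Without interprocedural analysis the
   compiler has to store counter before each call and reload it afterwards. */
int sum_global(int n) {
    int i;
    counter = 0;
    for (i = 0; i < n; i++) {
        counter += i;
        touch_nothing();
    }
    return counter;
}

/* Works on a local copy. The address of local never escapes, so the
   compiler is free to keep it in a register across the calls. */
int sum_local(int n) {
    int i;
    int local = 0;
    for (i = 0; i < n; i++) {
        local += i;
        touch_nothing();
    }
    counter = local;    /* write the global once, at the end */
    return local;
}

/* Both versions must agree; only the memory traffic may differ. */
void self_check(void) {
    assert(sum_global(100) == sum_local(100));
}
```

Comparing the assembly output of the two routines (for instance with the compiler's -S option) is one way to see whether your compiler makes the pessimal assumption or not.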