Path: utzoo!utgpu!attcan!uunet!husc6!ukma!rutgers!rochester!pt.cs.cmu.edu!PLAY.MACH.CS.CMU.EDU!bsy
From: bsy@PLAY.MACH.CS.CMU.EDU (Bennet Yee)
Newsgroups: comp.arch
Subject: Re: register save/restore
Message-ID: <3473@pt.cs.cmu.edu>
Date: 3 Nov 88 03:31:29 GMT
References: <3300037@m.cs.uiuc.edu> <5938@killer.DALLAS.TX.US> <7580@aw.sei.cmu.edu>
Sender: netnews@pt.cs.cmu.edu
Organization: Cranberry Melon
Lines: 90

In article <7580@aw.sei.cmu.edu> firth@bd.sei.cmu.edu (Robert Firth) writes:
}        Register Saving across Procedure Calls
}
}Which is better - caller saves or callee saves?
}A. Is this the right question?
}
}First, and most important, if you are designing a professional-quality
}production compiler, this is the wrong question.  Such a compiler must
}perform interprocedural optimisation if it is to be respectably state
}of the art.
} ...

You must also consider the problems of separate compilation and
multiple-language applications.  If register saving differs from module
to module, you'd better have language extensions that let you declare
that the external routines you call use some standard procedure call
mechanism, as well as ways to declare that a function you're writing
may be called by some external module and must likewise use a standard
convention.  The alternative is to require smart linkers.

This is probably a religious issue, much like network byte ordering
versus swap-only-as-required.

}B. Are the strategies equally sound?
}
}The point I consider most important, is that there is a definite
}semantic asymmetry between the two strategies.  If the caller saves,
}then the caller is saving, locally, his own local state.  This seems
}to me basically correct.  If the callee saves, then the callee is
}saving, local to him, state that belongs to someone else.
} ...
}This seems to me
}semantically unsound.
}
} ... I'm going to
}come out and say that "callee saves" is fundamentally wrong, and should
}be avoided if possible, even at some cost.
}
}C. Which is more efficient?
}
}Happily, however, the efficiency arguments, in my experience, support
}the "caller saves" strategy, so one can indeed do well by doing good.
}
}The most blatant case is that of the longjump, which appears in other
}languages as a GOTO or RAISE statement.  This causes a jump out of a
}procedure to somewhere further up the call chain, and so must reset
}the environment of the destination.  If the caller saves state, then
}this is simple: the jump is a jump, and the destination knows where
}all the state has been saved.  In most implementations, one need only
}reset the frame pointer to the current incarnation of the destination
}procedure, and take the jump to the label.
}
}But if the callee saves, then the caller has no idea how to recover
}his saved state, which may be buried any number of stack frames further
}down.  It is therefore necessary to unwind the entire stack before taking
}the jump.  The difference in cost can easily be a factor of 100 or more.

It's interesting to examine the ACIS implementation of longjmp/setjmp
for the IBM RTs.  The standard procedure call convention is callee-save,
and longjmp does NOT unwind the stack.  Contrast this with the Vaxen BSD
implementation of longjmp/setjmp, which DOES unwind the stack.  Vaxen
BSD, of course, uses callee-save too.

What is the difference?  Well, for the IBM RTs, your registers have the
same values as they had when setjmp originally returned.  On Vaxen, your
registers have the same values as they had when you called the next
function from within the same function that called the setjmp.  So
depending on one or the other behaviour for your register variables is
not safe.  It's a minor but significant semantic difference.  [Anybody
know what POSIX decided for this?]

Now, how to avoid unwinding the stack and still retain the same
semantics?
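
(An aside to make the register caveat above concrete.  This is only a
minimal sketch, assuming a hosted C compiler with <setjmp.h>; the names
env, deep and value_after_jump are made up for illustration and come
from neither ACIS nor BSD.  It shows the one portable way to sidestep
the difference: keep the variable out of registers by declaring it
volatile.)

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;              /* hypothetical name: the saved jump context */

/* Recurse a few frames down, then jump straight back to the setjmp site. */
static void deep(int depth)
{
    assert(depth >= 0);
    if (depth == 0)
        longjmp(env, 1);         /* does not return */
    deep(depth - 1);
}

/* Returns the value of a local that is changed after setjmp and read
 * after longjmp.  Declaring it volatile forces it into memory, so it is
 * unaffected by whatever the implementation does with registers. */
int value_after_jump(void)
{
    volatile int counter = 0;

    if (setjmp(env) == 0) {
        counter = 42;            /* modified between setjmp and longjmp */
        deep(3);
    }
    /* Without volatile, counter might read as 0 (registers restored from
     * the jmp_buf, as described for the RT) or as 42 (registers recovered
     * by unwinding, as described for the VAX). */
    return counter;
}
```

With the volatile qualifier, value_after_jump() gives 42 under either
longjmp behaviour.  That costs a memory access on every use, though, so
it still leaves the question of how to keep variables in registers and
get well-defined values back without unwinding the stack.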
It's actually not hard -- given that, for those ``other'' languages at
least, GOTO and RAISE are part of the language, the compiler can just
always save the contents of register variables before calling other
functions _only for those functions that contain GOTO or RAISE_, and
restore the register variables when the exception occurs.

Thus, you can get the efficiency of callee-saving (a big win for those
often-used, little leaf functions that use only a few scratch registers)
and retain the semantics that you want.

Of course, it's hard to argue a similar case for C, since setjmp/longjmp
is NOT part of the language....

And for those super-duper-smart compilers that put a variable into a
register for the first half of a function and another variable into the
same register for the second half, unwinding the stack to restore
registers from stack frames isn't quite enough either!

-bsy
-- 
Internet: bsy@cs.cmu.edu          Bitnet: bsy%cs.cmu.edu%smtp@interbit
CSnet:    bsy%cs.cmu.edu@relay.cs.net    Uucp: ...!seismo!cs.cmu.edu!bsy
USPS:     Bennet Yee, CS Dept, CMU, Pittsburgh, PA 15213-3890
Voice:    (412) 268-7571