Path: utzoo!attcan!uunet!zaphod.mps.ohio-state.edu!uwm.edu!ogicse!cs.uoregon.edu!mips!mash
From: mash@mips.com (John Mashey)
Newsgroups: comp.arch
Subject: Re: bizarre instructions
Message-ID: <658@spim.mips.COM>
Date: 2 Mar 91 21:56:05 GMT
References: <3025@charon.cwi.nl> <7063:Mar202:29:0091@kramden.acf.nyu.edu>
Sender: news@mips.COM
Organization: MIPS Computer Systems, Inc.
Lines: 47
Nntp-Posting-Host: winchester.mips.com

In article <7063:Mar202:29:0091@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>What I do think is practical is having each compiler writer at least
>provide a portable idiom for each instruction on his machine. (Portable

I'm NOT suggesting there is no use for the various kinds of inlining
and other ways of getting access to machine instructions, as has been
discussed. HOWEVER, I would observe that:

a) People must be careful NOT to assume that function calls are
   inordinately expensive. This is simply NOT true these days. On most
   of the current RISCs, especially when aided and abetted by good
   register allocators, function calls cost only a few cycles; in
   particular, calls to leaf routines (i.e., those that do not call
   others) are usually very cheap, because they almost never
   save/restore any registers.

b) Fast function calls are necessary for many purposes. I cannot think
   of any popular architecture designed in the last 5-8 years that
   hasn't paid at least some attention to making function calls fast,
   one way or another.

c) They solve a LOT of the problem being discussed.

d) They do not solve 100% of the problem being discussed. However, it
   is always worth doing the cycle count, for a given language and
   architecture, to choose between:
	1) The BEST that you can possibly do, with hand-coding
	2) The BEST you can do by writing leaf-level assembly programs
	   on the machine.
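A minimal sketch of that cycle count, in C. The cost model is an assumption
made here for illustration (cost of a called operation = cost of the
operation + a fixed call/return overhead), and the function names are
hypothetical; no figure below is a measurement of any real machine.

```c
#include <assert.h>

/* Hypothetical cost model: every cycle count is supplied by the caller
   and is an assumption, not a measurement of any real machine. */

/* Cycles for one use of an operation reached through a leaf-routine call. */
static unsigned called_cycles(unsigned op_cycles, unsigned call_overhead)
{
    return op_cycles + call_overhead;
}

/* Call overhead as a whole-number percentage of the inlined cost
   (integer division, so the result is rounded down). */
static unsigned overhead_percent(unsigned op_cycles, unsigned call_overhead)
{
    assert(op_cycles > 0);   /* avoid dividing by zero */
    return (100u * call_overhead) / op_cycles;
}
```

With an assumed 4-cycle call overhead, overhead_percent(35, 4) is 11 for
an assumed 35-cycle operation, while overhead_percent(1, 4) is 400 for a
1-cycle operation.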
I believe that in most cases, measured over the complete program,
case 2) does pretty well, especially because the instructions you're
trying to get to are often high cycle-count operations, where the %
overhead of getting to them is low.

The one obvious exception is if you have low cycle-count operations
(like rotate, or population count, or byte-swapping, or
string-primitives, etc) that you'd like to get inlined, and have no way
to express directly. At least C is moving, albeit slowly, in the
direction of better support for the possibility of these things.

However, the bottom line still is: analyze the cycle count differences
to see if the mechanisms are worth it. Sometimes they are, sometimes
they're not.
-- 
-john mashey	DISCLAIMER:
UUCP: 	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94086