Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!cmcl2!beta!hc!ames!amdcad!tim
From: tim@amdcad.AMD.COM (Tim Olson)
Newsgroups: comp.arch
Subject: Re: What should be in hardware but isn'
Message-ID: <18502@amdcad.AMD.COM>
Date: Fri, 2-Oct-87 13:38:14 EDT
Article-I.D.: amdcad.18502
Posted: Fri Oct 2 13:38:14 1987
Date-Received: Tue, 6-Oct-87 04:56:48 EDT
References: <581@l.cc.purdue.edu> <28200048@ccvaxa> <340@oracle.UUCP>
Reply-To: tim@amdcad.UUCP (Tim Olson)
Organization: Advanced Micro Devices
Lines: 28

In article <340@oracle.UUCP> bradbury@oracle.UUCP (Robert Bradbury) writes:
| So from my experience the cost of mapping CISC functions into CISC instructions
| can be quite a large part of the code generator of a compiler. Do the RISC
| people have any measures of how much work goes into a RISC code generator
| for things like DIV/MUL, STRCPY/MEMCPY or BRANCH scheduling? (Some of the code
| published for the AMD 29000 indicates these aren't afternoon efforts :-).)

In our "development" C compiler, div/mul, strcpy/memcpy are simply calls
to the runtime routines to perform these functions, so there was no cost
in the code generator for these. I didn't write the code generator, but
the delayed-branch scheduling code in the optimizer is very small.

| Have we gotten to the point where we can estimate the hardware development
| costs of branch destination caching vs. the software development costs
| of branch scheduling and trade them off against each other?

The two aren't mutually-exclusive (the Am29000 implements both).
Delayed-branches allow execution of instructions following the branch
which are already in the pipeline, while the Branch Target Cache reduces
or eliminates the latency involved in starting a new instruction stream.

Perhaps you mean the tradeoff between delayed-branches and branch
prediction?

-- 
Tim Olson
Advanced Micro Devices
(tim@amdcad.amd.com)