Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!rutgers!brl-adm!seismo!gatech!amdcad!bcase
From: bcase@amdcad.UUCP
Newsgroups: comp.unix.wizards
Subject: Complaint about complex architectures
Message-ID: <15341@amdcad.UUCP>
Date: Wed, 1-Apr-87 11:58:13 EST
Article-I.D.: amdcad.15341
Posted: Wed Apr 1 11:58:13 1987
Date-Received: Sat, 4-Apr-87 09:21:13 EST
References: <15292@amdcad.UUCP> <978@ames.UUCP> <15694@sun.uucp> <5@wb1.cs.cmu.edu> <6042@mimsy.UUCP>
Reply-To: bcase@amdcad.UUCP (Brian Case)
Organization: Advanced Micro Devices, Sunnyvale, California
Lines: 33

In article <6042@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <5@wb1.cs.cmu.edu> avie@wb1.cs.cmu.edu (Avadis Tevanian) writes:
>>... the 4.3 libc ... has been carefully optimized to use the fancy
>>VAX instructions for the string routines.  Unfortunately, some of
>>these instructions are not implemented by the MicroVAX-II hardware.
>>As it turns out, what is happening is that your tests (including
>>Dhrystone) are causing kernel traps to emulate those instructions!
>
>Exactly.  Strcpy, strcat, and strlen were all modified to use the
>Vax `locc' instruction to find the ends of strings.  This instruction
>is not implemented in hardware in the uVax II.  The obvious solution
>is to arrange the libraries so that on a uVax, programs use a
>straightforward test-byte-and-branch loop (see sample code below).

This brings up one of my major beefs about complex architectures:  an
optimizing compiler might have to do different things depending upon
the *version* of a CPU it is compiling for!  An optimizing compiler
that is considered "a great compiler" for one version of a CPU might
be "a mediocre compiler" for the next version of the machine.
The compiler writer finds out that some obvious sequences of code are
not the best for the current version of the machine, but then the
implementors of the next version "fake him out" by changing the
relative timings of the instructions.  (Note, too, that determining
instruction timings for some machines, e.g. VAXes, is nearly
impossible since DEC just won't tell you.  This makes superior code
generation a nightmare.)

One of the reasons that simple architectures are better for compilers
is that (nearly) all instructions take the same amount of time and
space.  Thus, code generation and optimization are *much* easier.
Also, this relationship of one time unit/one space unit per
instruction is unlikely to change as a function of CPU version.

    bcase
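
For reference, the "test-byte-and-branch loop" mentioned in the quoted
article might look something like the C sketch below.  This is not the
4.3BSD libc source, nor the sample code the quoted article refers to;
the names simple_strlen and simple_strcpy are made up for illustration,
and the only assumption is an ordinary NUL-terminated string.

```c
#include <stddef.h>

/*
 * Illustrative only: hypothetical names, not the 4.3BSD routines.
 * Each loop uses nothing fancier than a byte load, a compare against
 * zero, and a branch, so it does not depend on `locc' or any other
 * instruction that a given CPU version might emulate in software.
 */

/* Count the bytes before the terminating NUL. */
size_t simple_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0')      /* test byte, branch if not the end */
        p++;
    return (size_t)(p - s);
}

/* Copy src, including its terminating NUL, into dst; return dst. */
char *simple_strcpy(char *dst, const char *src)
{
    char *d = dst;

    while ((*d++ = *src++) != '\0')     /* copy byte, test, branch */
        ;
    return dst;
}
```

A loop like this is probably not the fastest choice on a machine that
does implement the string instructions in hardware, which is exactly
the version dependence complained about above.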