Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!zaphod.mps.ohio-state.edu!rpi!crdgw1!uunet!motcid!wallach
From: wallach@motcid.UUCP (Cliff H. Wallach)
Newsgroups: comp.arch
Subject: Re: Let's pretend
Keywords: Intel, 586, windows
Message-ID: <5874@avocado5.UUCP>
Date: 20 Dec 90 19:04:43 GMT
References: <3058@crdos1.crd.ge.COM> <1990Dec19.052338.3911@kithrup.COM> <3068@crdos1.crd.ge.COM> <1990Dec19.223934.1568@kithrup.COM>
Organization: Motorola Inc. - Cellular Infrastructure Div., Arlington Heights, IL 60004
Lines: 67

In article <1990Dec19.223934.1568@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
-In article <3068@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
--  I meant what I said - "quantify" rather than qualify. Yes optimization
--would be better and memory accesses would be down, but how much?
-
-Well...  I could suggest you go read any recent (last decade or so) papers on
-compiler optimization techniques, which would be chock full of them.  Also
-read papers on the RISC chips, and why the register and instruction sets
-were chosen.
-
-Here is a sample of code:
-
-        r2 = r3 = inb (0x3b8);
-
-        r2 |= 8;
-        outb (0x3b8, r2);
-
-(r0 through r7 are declared locally as 'unsigned long r0, r1, ...;', and
-inb and outb are declared as 'static inline unsigned char ...', and written
-using inline assembly)
-
-Here is the code gcc generates for that:
-
-        inb (%dx)
-        movl $952,-220(%ebp)
-        movw -220(%ebp),%dx
-        inb (%dx)
-        movb %al,-216(%ebp)
-        movzbl -216(%ebp),%eax
-        movl %eax,-216(%ebp)
-        movzbl -216(%ebp),%eax
-        movl %eax,-212(%ebp)
-        movl %eax,-216(%ebp)
-        movl -220(%ebp),%eax
-        movl %eax,-220(%ebp)
-        movl -216(%ebp),%eax
-        orl $8,%eax
-        movl %eax,-216(%ebp)
-        movw -220(%ebp),%dx
-        movb -216(%ebp),%al
-        outb (%dx)

Is this code for real?

-
-Exercise for reader:  assuming 16 registers, rewrite that code using only
-r0 through r7 (which was all I had declared in my code).
-Then, take out an Intel book on the '386, and figure out the timings of the
-old code and the new code (assume that the new register set will be accessed
-in the same amount of time as the old register set, since I'm talking about
-completely trashing the instruction set and redesigning it).
-

Exercise for compiler writers:  Generate optimized code for a current
architecture.  Maybe something like:

        xor     eax,eax
        mov     edx,3b8h
        in      al,dx
        mov     r3[bp],eax
        or      al,8
        out     dx,al
        mov     r2[bp],eax

Cliff Wallach
...uunet!motcid!wallach
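
For anyone who wants to check that the short hand-written sequence computes the same values as the quoted C fragment, here is a minimal sketch in C. It is only a model: the I/O port is simulated by an array so it can run in user mode, and fake_ports, fake_inb, fake_outb, struct regs, c_fragment and hand_sequence are hypothetical names made up for this illustration, not the inline-assembly inb/outb from the quoted post. Each statement in hand_sequence stands for one instruction of the sequence (with the port taken from dx, as the x86 in/out instructions require).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the I/O port space: a plain array, so the
   sketch runs without any port-access privileges. */
#define PORT_SPACE 0x400u
static uint8_t fake_ports[PORT_SPACE];

/* Hypothetical replacement for the inline-assembly inb. */
static uint8_t fake_inb(uint16_t port)
{
    assert(port < PORT_SPACE);
    return fake_ports[port];
}

/* Hypothetical replacement for the inline-assembly outb. */
static void fake_outb(uint16_t port, uint8_t value)
{
    assert(port < PORT_SPACE);
    fake_ports[port] = value;
}

/* Only the two variables the fragment touches. */
struct regs { unsigned long r2, r3; };

/* The C fragment from the quoted post, with the stubs swapped in. */
static struct regs c_fragment(void)
{
    struct regs r;
    r.r2 = r.r3 = fake_inb(0x3b8);
    r.r2 |= 8;
    fake_outb(0x3b8, (uint8_t)r.r2);
    return r;
}

/* The hand-written sequence, one statement per instruction. */
static struct regs hand_sequence(void)
{
    struct regs r;
    uint32_t eax, edx;

    eax = 0;                                         /* xor eax,eax    */
    edx = 0x3b8;                                     /* mov edx,3b8h   */
    eax = (eax & ~0xffu) | fake_inb((uint16_t)edx);  /* in  al,dx      */
    r.r3 = eax;                                      /* mov r3[bp],eax */
    eax |= 8;                                        /* or  al,8       */
    fake_outb((uint16_t)edx, (uint8_t)eax);          /* out dx,al      */
    r.r2 = eax;                                      /* mov r2[bp],eax */
    return r;
}
```

To use it, seed fake_ports[0x3b8] with some byte, call c_fragment(), reseed the port with the same byte, call hand_sequence(), and compare the two results and the final port contents. Note that 0x3b8 is the 952 that appears in the gcc listing. The model says nothing about timings; it only checks that two memory stores are enough to produce the same r2 and r3.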