Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!mailrus!ames!amdahl!pyramid!prls!mips!earl
From: earl@mips.COM (Earl Killian)
Newsgroups: comp.arch
Subject: Re: CISC instructions
Message-ID: <2912@wright.mips.COM>
Date: 28 Aug 88 06:52:05 GMT
References: <13254@mimsy.UUCP>
Lines: 70

In article <13254@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
[much analysis of VAX CALLS instruction -- a classic CISC mistake]
...
> If I am allowed to avoid the standard stack frame format, I can cut the
> time to 7.1 seconds:
>
>         /* modified open coded call */
>                 .globl  _null
>         _null:  movl    sp,fp           # build frame
>                 rsb
>
>                 .globl  _main
>         _main:  .word   0
>                 movl    $1000000,r11
>         0:      movq    ap,-(sp)        # save ap, fp
>                 moval   4(sp),ap        # new ap
>                 jsb     _null           # call
>                 movq    (sp)+,ap        # restore ap, fp
>                 sobgtr  r11,0b
>                 ret
>
> This is somewhat less realistic as no registers are saved, and none
> restored; if `null' were to use some, it would have to read:
>
>         _null:  pushr   $mask           # save local registers
>                 movl    sp,fp           # build frame
>                 /* body */
>                 movl    fp,sp           # set up for return
>                 popr    $mask           # restore registers
>                 rsb
>
> which adds three instructions, two of them relatively slow (pushr and
> popr), changing the time (for mask=0) to 8.7 seconds.

You can do better than this.  Here's the output from a compiler that's
been around for 4 years or so, using its -fast_call option:

/* pastel 2.3 compiled test.p on 27 August 88 22:16 PST by earl
   4 statements, 8 instructions in 29 bytes, 0 static bytes */
        .globl  _a
        .globl  _pascal_runtime__main_end
        .globl  _pascal_runtime__main_start
_a:
#   1  procedure a();
        rsb
#   5  program test;
        .globl  _main
_main:
        jsb     _pascal_runtime__main_start/* ?, main_start */
#   9    for i := 1000000 downto 1 do begin
        movl    $1000000,r2     /* ?, i */
L107:
#  10      a();
        bsbb    _a
        decl    r2      /* i */
        bneq    L107
        jsb     _pascal_runtime__main_end/* ?, main_end */
        rsb

On a VAX 780 this is 3.3 seconds, whereas yours is 9.3 seconds, or
2.8x slower.
You mistakenly assumed you need a frame pointer, and used an argument
pointer, both of which the compiler-generated code avoided.  No
registers saved is actually quite realistic, when the call protocol
uses a mix of both callee and caller-saved registers.
--
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086
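
The timing comparison above is easy to check by hand. The following is only a rough sketch in Python, not part of the original post: the function and constant names are invented for illustration, and the only inputs are the figures quoted above (1000000 iterations, 3.3 seconds, 9.3 seconds). It works out the average cost of one loop iteration and the slowdown ratio.

```python
# Figures taken from the post; all names here are hypothetical.
ITERATIONS = 1_000_000      # loop bound used in both listings
FAST_CALL_SECONDS = 3.3     # bsbb/rsb loop from the compiler output
OPEN_CODED_SECONDS = 9.3    # the quoted sequence, as timed in the reply


def per_iteration_us(total_seconds, iterations=ITERATIONS):
    """Average time for one loop iteration, in microseconds."""
    return total_seconds / iterations * 1e6


def slowdown(slow_seconds, fast_seconds):
    """How many times longer the slow version takes than the fast one."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    fast = per_iteration_us(FAST_CALL_SECONDS)
    slow = per_iteration_us(OPEN_CODED_SECONDS)
    ratio = slowdown(OPEN_CODED_SECONDS, FAST_CALL_SECONDS)
    print(f"fast call:  {fast:.1f} us per iteration")
    print(f"open coded: {slow:.1f} us per iteration")
    print(f"ratio:      {ratio:.1f}x")
```

With a million iterations, the total in seconds and the per-iteration cost in microseconds are the same number, so the fast loop costs about 3.3 microseconds per call, return, decrement, and branch. The ratio comes out to roughly 2.8, matching the figure in the post.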