Xref: utzoo comp.lang.misc:7689 comp.object:3429
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!samsung!uunet!mcsun!ukc!icdoc!qmw-cs!eliot
From: eliot@cs.qmw.ac.uk (Eliot Miranda)
Newsgroups: comp.lang.misc,comp.object
Subject: Re: Type Systems and Dynamic Binding
Message-ID: <3626@sequent.cs.qmw.ac.uk>
Date: 1 May 91 13:06:48 GMT
References: <3618@sequent.cs.qmw.ac.uk> <7w4813w164w@mantis.co.uk>
Followup-To: comp.lang.misc
Organization: Computer Science Dept, QMW, University of London, UK.
Lines: 80

In article <7w4813w164w@mantis.co.uk> mathew@mantis.co.uk (mathew) writes:
>eliot@cs.qmw.ac.uk (Eliot Miranda) writes:
>> In article <1991Apr19.132239.9252@daffy.cs.wisc.edu> quale@saavik.cs.wisc.edu
>> > A compiler for a dynamically typed language can dedicate a register to
>> > hold a bitmask that will speed type tag operations. A C program would
>> > either use an immediate value or a global variable, either of which is
>> > slower and bulkier on many architectures.
>>
>> I have tried this in my Smalltalk VM. On both 68020 & SPARC there is no
>> significant difference in performance attributable to dedicating a register
>> to tag detection. Add a register here loose it there.
>
>You're saying that you've tried keeping the tags in memory vs. keeping them
>in a register on a 68020, and you've not noticed any difference? I find this
>a little hard to believe.
>
>Exactly what changes did you make to the code in order to use register-based
>tags rather than memory-based tags, and what sort of program did you run in
>order to test the result?
>

The program is a "dynamic translation to direct threaded code" Smalltalk-80
virtual machine written in C, compiled with GCC & edited into threaded code by
sed-scripts run on the compiler generated assembler.
The system is about 20,000 lines of C.
The system is tested by running the standard Smalltalk-80 macro benchmark
suite, which produces a performance figure expressed as a percentage of the
speed of a Dorado, Xerox's fastest D machine.

The format of tagged data is tags in the bottom 2 bits:
	01 -> 30 bit signed integer
	10 -> 16 bit unsigned character
	11 -> 30 bit fixed point (16 bit fraction)
	00 -> Ordinary Object Pointer

Using GCC one can declare global register variables, e.g.
	register void (**tcip)() asm("a5");
declares the threaded code instruction pointer to be in a5 on 68020s.

To test tags using immediate masks I use:
	#define TagMask 3
	#define isTagged(o) ((unsigned long)(o)&TagMask)

To test tags using a tag register I use:
	#define TagMask 3
	register unsigned long tagReg asm("d3");
	#define isTagged(o) ((unsigned long)(o)&tagReg)
	...
	tagReg = TagMask
	...

The code generated by the two sequences is either
	movel a3@,d6
	moveq #3,d2
	andl d6,d2
	jeq LABEL
or
	movel a3@,d6
	andl d3,d6
	jeq LABEL

When I compared the two variants they ran the benchmarks to within 0.5% of
each other. This difference is below the noise level of the measurements and
hence not significant.

I suspect that this is because
	a) moveq is very quick
	b) reserving a register for the tag mask slows down other parts of the
	   system, cancelling out the gain.

Basically, optimizations are tradeoffs between representations, and each
implementation of a given piece of functionality will have its performance
and its cost.
-- 
Eliot Miranda                   email: eliot@dcs.qmw.ac.uk
Dept of Computer Science        ARPA:  eliot%dcs.qmw.ac.uk@nsf.ac.uk
Queen Mary Westfield College    UUCP:  eliot@qmw-dcs.uucp
Mile End Road                   Fax:   081 980 6533 (+44 81 980 6533)
LONDON E1 4NS                   Tel:   071 975 5229 (+44 71 975 5229)
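
For anyone who wants to experiment with the tag layout described in the post,
here is a minimal portable sketch in C. It is not taken from the VM itself.
The names oop, isTaggedReg, make_int, int_value, make_char and char_value are
hypothetical, a 32-bit word is assumed, and tagReg is an ordinary static
variable standing in for the 68020 register that GCC's asm("d3") would pin it
to.

```c
#include <stdint.h>

/* Tag layout from the post: the bottom 2 bits of each 32-bit word. */
#define TagMask    3u
#define TagPointer 0u   /* 00 -> ordinary object pointer               */
#define TagInteger 1u   /* 01 -> 30 bit signed integer                 */
#define TagChar    2u   /* 10 -> 16 bit unsigned character             */
#define TagFixed   3u   /* 11 -> 30 bit fixed point (16 bit fraction)  */

typedef uint32_t oop;   /* hypothetical name for a tagged word */

/* Variant 1: test the tag with an immediate mask. */
#define isTagged(o) ((uint32_t)(o) & TagMask)

/* Variant 2: test the tag with a mask held in a variable.
 * Assumption: a plain static replaces the dedicated register so that
 * the sketch compiles on any target. */
static uint32_t tagReg = TagMask;
#define isTaggedReg(o) ((uint32_t)(o) & tagReg)

/* Hypothetical helper: box a value that fits in 30 bits as an integer. */
static oop make_int(int32_t v)
{
    return ((uint32_t)v << 2) | TagInteger;
}

/* Hypothetical helper: unbox a tagged integer.
 * Clearing the tag leaves a multiple of 4, so the division is exact.
 * Assumes the usual two's complement conversion to int32_t. */
static int32_t int_value(oop o)
{
    return (int32_t)(o & ~TagMask) / 4;
}

/* Hypothetical helper: box a 16 bit character. */
static oop make_char(uint16_t c)
{
    return ((uint32_t)c << 2) | TagChar;
}

/* Hypothetical helper: unbox a tagged character. */
static uint16_t char_value(oop o)
{
    return (uint16_t)(o >> 2);
}
```

Both macros yield zero for an ordinary object pointer and the tag value
otherwise, so either form can sit in the same "jeq LABEL" style test. Only
the source of the mask differs, which is the one thing the benchmark
comparison above varies.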