Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!mcgill-vision!snorkelwacker!usc!samsung!uakari.primate.wisc.edu!uflorida!haven!mimsy!chris
From: chris@mimsy.umd.edu (Chris Torek)
Newsgroups: comp.arch
Subject: Re: Is handling off-alignment important?
Summary: bad antecedent for `this'
Keywords: VAX, quad-word, alignment
Message-ID: <26506@mimsy.umd.edu>
Date: 12 Sep 90 13:19:33 GMT
References: <104037@convex.convex.com> <8840014@hpfcso.HP.COM> <410@news.nd.edu>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 72

>[misstatement about VAX quad-word loads ignoring low-order bits]

In article <26376@mimsy.umd.edu> I wrote:
>Since there is no mention of this in the VAX architecture handbook ...

>In article <1990Sep8.225345.745@quick.com> srg@quick.com (Spencer Garrett)
>suggested:
>>	What may well have happened is that some early LISP implementer just
>>"tried it" and found that on his vax the low order bits were ignored.
>>so maybe he went ahead and used it.  *BUT* since it isn't in the manual,

In article <410@news.nd.edu>, przemek@liszt.helios.nd.edu
(Przemek Klosowski) writes:
>But, IT IS IN THE MANUAL!  [see page 33 of the VAX architecture book]

I probably should have followed up to Spencer Garrett's posting myself.
What I meant in <26376@mimsy.umd.edu> was `no mention of ignoring low
order bits on movq', not `no mention of alignment requirements'.  It is
well-known that the VAX architecture (and therefore its handbook :-) )
allows arbitrary alignment for word and longword operations.

Note, however, that there *are* some (exactly four, as far as I know)
instructions that do require strict alignment, namely the interlocked
queue instructions:

	insqhi	remqhi	insqti	remqti

All of these require that their queue be on a quadword boundary (and,
further, that the relative offsets that make these objects into queues
be multiples of 8 as well).
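
As a rough illustration of the two requirements just stated, here is a
minimal C sketch of the check.  The names (is_quadword_aligned,
struct queue_header, queue_operands_ok) are hypothetical, invented for
this example; the struct is only a model of a header holding two
relative offsets, not a definition taken from the handbook.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the low three bits are all zero,
 * i.e. the value is a multiple of 8 (quadword aligned). */
static bool is_quadword_aligned(uint32_t value)
{
    return (value & 7u) == 0;
}

/* Hypothetical model of a queue header: two relative offsets,
 * one forward and one backward.  Illustration only. */
struct queue_header {
    int32_t forward_offset;
    int32_t backward_offset;
};

/* Both conditions from the text: the header sits on a quadword
 * boundary, and each relative offset is a multiple of 8.
 * Negative offsets are checked through their two's-complement bits. */
static bool queue_operands_ok(uint32_t header_addr,
                              const struct queue_header *h)
{
    return is_quadword_aligned(header_addr)
        && is_quadword_aligned((uint32_t)h->forward_offset)
        && is_quadword_aligned((uint32_t)h->backward_offset);
}
```

The point of the sketch is that the check rejects a bad operand rather
than masking off its low bits.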
If the address handed to one of these instructions is misaligned, or if
the queue offsets are invalid, you get a reserved operand fault (again,
the bits are not ignored).

Incidentally, these instructions are excruciatingly slow.  On an 11/780,
the table below times an insert/remove pair at about twice a subroutine
call (CALLS plus RET), which puts a single one at roughly 150% of the
time for an integer divide with FPA.  (NB: I have no explanation as to
why an interlocked instruction should be faster on a 750.)

[begin text from an article in net.unix that I saved back in 1983]

The following VAX instruction timings were obtained from a former DEC
employee.  I cannot vouch for their accuracy and have no idea how they
were obtained.

            VAX-11/780 vs. VAX-11/750 vs. VAX-11/730
                            WITH FPA

INSTRUCTION                      780     750     730     750     730

INTERLOCKED INSERT + REMOVE    30.43   26.43   41.02   1.151   0.742

versus, e.g.,

MOVL Reg, Reg                   0.40    0.93    1.69   0.430   0.237
MOVL mem, Reg                   0.84    1.67    4.94   0.503   0.170
MOVL Reg, mem                   1.31    2.28    4.88   0.575   0.268
CMPL AND BLEQ                   1.16    2.32    4.26   0.500   0.272
CMPL mem, Reg AND BLEQ          1.88    3.24    7.31   0.580   0.257
TSTL AND BLEQ                   1.00    2.42    4.25   0.413   0.235
BRW                             0.80    2.01    2.57   0.398   0.311
MULL2 Reg, Reg                  1.85    5.68   12.05   0.326   0.154
MULL2 mem, Reg                  2.50    6.55   15.14   0.382   0.165
MULL2 Reg, mem                  2.48    6.41   15.11   0.387   0.164
DIVL3 Reg, Reg, Reg             9.64    8.88   16.15   1.086   0.597
CALLS #0, ROUTINE + RET        14.75   20.87   36.61   0.707   0.403
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris
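
For anyone who wants to check the comparisons against the table above,
here is a small C sketch of the arithmetic.  The three constants are
copied from the 780 column.  Everything else is an assumption of mine:
the helper names are hypothetical, and halving the pair timing assumes
that an insert and a remove cost about the same, which the table does
not say.  The last two columns appear to be the 780 time divided by the
750 and 730 times; that reading is inferred from the numbers, not stated
in the original.

```c
#include <stdio.h>

/* 11/780 timings copied from the table (units as given there). */
static const double INSQ_REMQ_PAIR = 30.43; /* INTERLOCKED INSERT + REMOVE */
static const double DIVL3_REG      = 9.64;  /* DIVL3 Reg, Reg, Reg */
static const double CALLS_RET      = 14.75; /* CALLS #0, ROUTINE + RET */

/* Hypothetical helper: how many times longer a takes than b. */
static double ratio(double a, double b)
{
    return a / b;
}

/* Assumption: insert and remove cost the same, so one is half the pair. */
static double single_queue_op(void)
{
    return INSQ_REMQ_PAIR / 2.0;
}

/* Print the two comparisons made in the text. */
void print_ratios(void)
{
    printf("pair vs. CALLS+RET:  %.2f\n",
           ratio(INSQ_REMQ_PAIR, CALLS_RET));
    printf("single vs. DIVL3:    %.2f\n",
           ratio(single_queue_op(), DIVL3_REG));
}
```

Calling print_ratios() from a main() prints both figures.  The same
ratio() helper applied to the 780 and 750 entries of any row can be
compared with that row's fourth column.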