Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!jarthur!nntp-server.caltech.edu!laguna.ccsf.caltech.edu!daveg
From: daveg@near.cs.caltech.edu (Dave Gillespie)
Newsgroups: comp.arch
Subject: Re: What *should* architectural pointers point at?
Message-ID:
Date: 31 Aug 90 08:36:47 GMT
References: <0887@sheol.UUCP> <41167@mips.mips.COM> <141598@sun.Eng.Sun.COM>
Sender: news@laguna.ccsf.caltech.edu
Organization: California Institute of Technology
Lines: 56
In-Reply-To: petolino@joe.Eng.Sun.COM's message of 30 Aug 90 20:28:14 GMT

> = Joe Petolino, >> = Herman Rubin, >>> = Me

>>>You could provide two variants of the load/store instructions, one set
>>>that trap on unaligned accesses and one set that don't (but are possibly
>>>much slower).

> Although you don't say so, I assume that the slower variants would actually
> give the right results.  Remember that there are architectures out there
> that neither trap nor work correctly when presented with unaligned addresses.

Yes, I meant that one variety would trap, the other would do the extra
work to do an unaligned access correctly.

>>> This latter instruction replaces the specialized
>>>"bit-field" instructions that some machines have now.  Compilers could
>>>have an option to generate only the slow-but-safe instructions for the
>>>benefit of fast-but-reckless programmers.

>>Why much slower?  At most two items would have to be loaded, and a shift
>>made.

> The main reason not to allow unaligned accesses is that supporting
> unaligned accesses greatly increases the *complexity* of the memory system
> in a machine with virtual addressing and caches.  Speed is probably not an
> issue - in the implementations I've seen (various IBM 370 implementations),
> you only pay a speed penalty if you actually use an unaligned address.
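
For concreteness, the "two items loaded, and a shift made" sequence might
look roughly like the C below.  This is only a sketch under assumptions the
thread does not specify: 32-bit words, byte addressing, and little-endian
byte numbering within a word.  The names load_unaligned32 and mem are made
up for illustration and do not come from any real machine or library.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: fetch a 32-bit value starting at byte address
   `addr`, using only aligned 32-bit word loads.  `mem` models memory as
   an array of aligned words; bytes within a word are numbered from the
   least significant end (an assumed little-endian convention). */
uint32_t load_unaligned32(const uint32_t *mem, size_t addr)
{
    size_t   word  = addr / 4;              /* aligned word holding the first byte */
    unsigned shift = (unsigned)(addr % 4) * 8;  /* bit offset within that word */
    uint32_t lo    = mem[word];             /* first aligned load */

    if (shift == 0)
        return lo;                          /* aligned case: one load, no shift */

    uint32_t hi = mem[word + 1];            /* second aligned load */
    return (lo >> shift) | (hi << (32 - shift));
}
```

With mem holding the words 0x44332211 and 0x88776655, a load from byte
address 1 takes the two-load path and yields 0x55443322, while a load from
address 0 or 4 takes the one-load path.  The extra cost shows up only on
the unaligned path.
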
> If you did implement instructions that work with unaligned addresses, there
> would be little reason to also implement the trap-on-unaligned variants -
> the complexity has to be there anyway.  As an example of the kind of
> complexity that arises with support for unaligned accesses, imagine a
> store that spans a virtual page boundary, and only one of the pages is
> write-protected.

I thought perhaps the extra hardware to handle unaligned bit addresses
might be unpleasant; you would have to shift, e.g., 64 possible ways,
so you would probably want to use the ALU's shifter for this rather
than a special shifter in the bus interface.  So I envisioned one set
of instructions that are slow because they need the ALU to be available,
and one set that can use a maximally-simple bus interface directly and
can run simultaneously with ALU instructions.

Of course, a clever processor could have a single load instruction that
computed the address, tested if it was aligned, and, if not, only then
went on to requisition the ALU.  Then you would need only one type of
instruction, but it would make the hardware more complicated.  It would
now be data-dependent whether a load instruction and an ALU instruction
could be allowed to run simultaneously.

Maybe barrel shifters are fast and easy enough these days that this
isn't a big deal.  But my guess is that it still is.

								-- Dave
--
Dave Gillespie
  256-80 Caltech Pasadena CA USA 91125
  daveg@csvax.cs.caltech.edu, ...!cit-vax!daveg