Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!mips!zalman
From: zalman@mips.com (Zalman Stern)
Newsgroups: comp.arch
Subject: Re: endian etc
Keywords: endianness??, cache
Message-ID: <3407@spim.mips.COM>
Date: 11 May 91 11:13:29 GMT
References: <2496@cybaswan.UUCP>
Sender: news@mips.COM
Organization: MIPS Computer Systems, Sunnyvale, California
Lines: 61
Nntp-Posting-Host: dish.mips.com

In article <2496@cybaswan.UUCP> ex2mike@cybaswan.UUCP ( m overton) writes:
>
>A simple answer to all the problems with byte order etc, to the
>authors mind, is to have a duplicate set of load and store instructions.
>Since most RISC machines have very few of them, adding a set for the
>opposite sense would surely be very easy. It wouldn't work with
>instructions, of course, but I assume that is not a problem.
>

No, it doesn't help at all. Current RISC chips which support bi-endian
operation simply xor a constant with the low order address bits on
non-word operations. The constant changes depending on the byte-order
bit. Storing a word, changing the byte order bit, and loading the same
word will get exactly the same value (it will not be byte swapped).
Words have a constant format in memory and byte addresses are modified
appropriately. Hence a buffer full of bytes cannot be accessed by
processes of different byte sex without word swapping the buffer. Moving
the byte order bit into the opcode accomplishes nothing. (Except wasting
a lot of opcode space. See below.)

A real solution would be to add byte lane swapping hardware to the chip.
This hardware would actually swap the bytes on every load and store
depending on the byte order bit. (it can stay in the status register, it
doesn't matter.) That way, memory is laid out so that bytes are always in
the same place and word operations swap them around as necessary. If this
were the case, buffers would not need to be byte swapped at all.

So why don't we do this?
The word over here in software is that the hardware is expensive in terms
of space and time. It's very likely to end up on the critical path for
loads and stores. Any impact on cycle time to support bi-endianness is a
lose. (This makes sense to me, but I write code for a living. The guys on
the other side of the building are probably falling out of their chairs
laughing as they read this.)

Another point about architecture: I wouldn't say RISC machines have
relatively few load and store instructions, but either way, these
instructions tend to have large (~16 bit) immediate offsets. Every such
instruction will take one major opcode. A quick glance at an opcode table
for MIPS-I indicates that putting the byte order bit into the opcode
field would add 17 new opcodes. There are 24 free opcodes in MIPS-I. In
MIPS-II, the 17 goes up and the 24 goes down such that there would not be
enough opcodes. (And believe me, the opcode space got used for something
a hell of a lot more useful than making the instruction set byte-wise
bisexual.) It's even worse for a machine like the RS/6000, which has
update and indexed variants of its load/store instructions.

>	Wouldn't it be very easy on a machine with a write back cache
>to copy words simply by changing the internal cached address ( a little
>like a form of cache aliasing). A lot of time is spent in most code
>just copying things around. Would this not improve things ( you gain
>immediately on cache occupancy).

This is one of many cache tricks you can play to make memory copies
(bcopy) and memory fills (bzero) go fast. Mostly these sorts of things
are only used inside the operating system because they aren't safe for
unprivileged code to use. One can imagine hardware designed to provide
this functionality for user processes.
--
Zalman Stern, MIPS Computer Systems, 928 E. Arques 1-03, Sunnyvale, CA 94088
zalman@mips.com OR {ames,decwrl,prls,pyramid}!mips!zalman     (408) 524 8395
"Never rub another man's rhubarb" -- the Joker via Pop Will Eat Itself
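
The difference between the two schemes discussed in the post (XORing the
low address bits versus swapping byte lanes) can be sketched with a toy
model in Python. This is only an illustration under stated assumptions,
not a description of any real chip: the class names XorAddressMemory and
LaneSwapMemory, the helper functions, and the choice to number byte lanes
within a word from the least significant byte are all hypothetical.

```python
# Toy model only. Names and lane numbering are illustrative assumptions.

class XorAddressMemory:
    """Words have one fixed format; byte addresses are XORed with 3
    when the byte-order bit says big-endian."""

    def __init__(self, n_words):
        self.words = [0] * n_words

    def store_word(self, addr, value, big_endian):
        # The byte-order bit has no effect on word accesses.
        self.words[addr >> 2] = value & 0xFFFFFFFF

    def load_word(self, addr, big_endian):
        return self.words[addr >> 2]

    def _shift(self, addr, big_endian):
        lane = (addr ^ (3 if big_endian else 0)) & 3
        return 8 * lane

    def store_byte(self, addr, value, big_endian):
        shift = self._shift(addr, big_endian)
        word = self.words[addr >> 2] & ~(0xFF << shift)
        self.words[addr >> 2] = word | ((value & 0xFF) << shift)

    def load_byte(self, addr, big_endian):
        return (self.words[addr >> 2] >> self._shift(addr, big_endian)) & 0xFF


class LaneSwapMemory:
    """Bytes always live at their own address; word accesses reorder
    the four bytes according to the byte-order bit."""

    def __init__(self, n_bytes):
        self.data = bytearray(n_bytes)

    def store_byte(self, addr, value, big_endian):
        self.data[addr] = value & 0xFF

    def load_byte(self, addr, big_endian):
        return self.data[addr]

    def store_word(self, addr, value, big_endian):
        order = "big" if big_endian else "little"
        self.data[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, order)

    def load_word(self, addr, big_endian):
        order = "big" if big_endian else "little"
        return int.from_bytes(self.data[addr:addr + 4], order)


def write_buffer(mem, payload, big_endian):
    """Store a byte string at address 0, one byte store at a time."""
    for i, b in enumerate(payload):
        mem.store_byte(i, b, big_endian)


def read_buffer(mem, length, big_endian):
    """Read `length` bytes from address 0, one byte load at a time."""
    return bytes(mem.load_byte(i, big_endian) for i in range(length))
```

In this model, a buffer written as b"ABCD" in big-endian mode reads back
as b"DCBA" in little-endian mode under the XOR scheme, while a word
stored in one mode loads unchanged in the other. Under the lane-swap
scheme it is the other way round: the buffer reads back as b"ABCD" in
either mode, and a word stored as 0x11223344 in big-endian mode loads as
0x44332211 in little-endian mode.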