Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!usc!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!crdgw1!crdos1!davidsen
From: davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr)
Newsgroups: comp.arch
Subject: Re: Is handling off-alignment important? (was Re: RISC hard to program?)
Message-ID: <2370@crdos1.crd.ge.COM>
Date: 25 Jul 90 13:06:21 GMT
References: <104037@convex.convex.com> <8840016@hpfcso.HP.COM>
Reply-To: davidsen@crdos1.crd.ge.com (bill davidsen)
Organization: GE Corp R&D Center, Schenectady NY
Lines: 86

Please note that I have trimmed the quotes in the previous posting
heavily. I'm not trying to distort the meaning of the original poster,
just to keep the size reasonable.

In article <8840016@hpfcso.HP.COM> dgr@hpfcso.HP.COM (Dave Roberts) writes:

| Yea, I agree with you if LRU is the strategy, but how do you know what
| page replacement strategy is being used? This isn't usually something
| that the hardware specifies.

A fair question. The hardware spec doesn't say that compilers have to
align things on word boundaries, either. If the hardware is such that
certain software practices are needed, then they *are* implicitly given
in the spec.

| It is usually (read always :-) left to
| the O/S to implement. The O/S could implement a FIFO strategy and
| page 1 could have been the first page in.

In which case the restart would cause a page fault, the first page
would come back in, and the second restart would complete.

|
| I understand that the problem can be solved and many of the techniques
| for solving it. What I'm interested in is: how much you pay for the
| solution? How much time is spent trying to solve it? How much does
| it cost in terms of overall performance? How often is it used? If
| it's more expensive than doing aligned transactions even for
| processors that support it, do users tend to try and make all their
| transactions aligned? If so, why have it and thereby slow down
| everything?
| Why not leave software to deal with the case when a user
| has to do unaligned transactions?

If you believe, as I do, that systems programs will sometimes have to
access data which is not aligned, then the only question is whether it
should be done in hardware or software. This arbitrary data can come
from another machine (not always even a computer), or be packed to keep
volume down.

If it is being done in software, the source code has to contain a check
for misalignment, which in turn means that the format of a pointer *on
that machine* must be known, as well as the alignment requirements. Bad
and non-portable.

Or, you can simply access every data item larger than a byte using the
"fetch a byte and shift" method. This requires that the byte order of
the data, rather than the machine, be known. I think that's probably
the only portable way.

Alternatively, the hardware can support unaligned fetch. It doesn't
have to be efficient (you would have to make an effort to make the
fetch logic slower than software); it just has to work. This makes the
program a bit smaller and, assuming that the chip logic is right, it
saves everyone from implementing their own try at access code.

If the hardware could produce a clear trap for unaligned access (not
the general bus fault, etc.), the o/s could do software emulation. From
the user's view that would look like a hardware solution. This is like
emulating f.p. instructions in the o/s when the FPU is not present, and
does not represent a major change in o/s technology.

Note that this is not a RISC issue, in that the bus interface unit may
already be doing things like cache interface, multiplexing lines,
controlling status lines, etc. The BIU is not really RISC in that
sense. If you draw a logic diagram, it functions like a coprocessor
whose function is to provide data, which can go into the pipeline or
into the CPU.
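
As a rough illustration of the two software approaches described above
(testing for misalignment versus fetching a byte at a time and
shifting), here is a minimal C sketch. The helper names load_be32,
load_le32 and is_aligned are hypothetical, and the code assumes C99's
<stdint.h>, which postdates this posting.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: "fetch a byte and shift" for a 32-bit value
 * stored most-significant byte first. Only byte accesses are made, so
 * p may have any alignment; the byte order of the *data* is all that
 * has to be known. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}

/* Same idea for data stored least-significant byte first. */
static uint32_t load_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]        |
           ((uint32_t)p[1] <<  8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Hypothetical helper for the other approach: test a pointer for
 * alignment. The pointer-to-integer conversion is implementation-
 * defined, so this relies on how pointers are represented on the
 * machine at hand, which is the portability problem noted above. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

Given the bytes 0x12 0x34 0x56 0x78 at an odd offset in a buffer,
load_be32 returns 0x12345678 whatever the byte order or alignment rules
of the host, at the cost of four byte loads and three shifts per item.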

| Do users really want to pay the
| price all the time for this support or would they rather take a big
| hit every so often? Obviously some will and some won't but that's
| the case for any architectural decision. These are the questions I'm
| really trying to answer.

You assume that there is a price all the time. I'm pretty well
convinced that the BIU in processors which have this capability, such
as the 80486, doesn't have a greater latency for aligned access than
the equivalent unit in SPARC or 88000. Not having the proprietary
on-chip timing, I can't be totally certain, obviously.

I think the real question is "should unaligned access be provided
outside the user program?" I think the answer is yes. Obviously it can
be done better in hardware, but if a chip is so tight on gates that it
can't be done without compromising performance elsewhere, then just a
separate trap for quick identification of the problem by the o/s would
be a reasonable alternative.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    "Stupidity, like virtue, is its own reward" -me