Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!uxc!uxc.cso.uiuc.edu!mcdurb!aglew
From: aglew@mcdurb.Urbana.Gould.COM
Newsgroups: comp.arch
Subject: Re: When is RISC not RISC?
Message-ID: <28200276@mcdurb>
Date: 17 Feb 89 14:07:00 GMT
References: <747@atanasoff.cs.iastate.edu>
Lines: 40
Nf-ID: #R:atanasoff.cs.iastate.edu:747:mcdurb:28200276:000:1831
Nf-From: mcdurb.Urbana.Gould.COM!aglew    Feb 17 08:07:00 1989

>Andy also mentioned at one point that he thought string ops weren't
>very necessary in a machine that had word ops with masks (or somesuch;
>I don't recall exactly). I wanted to hear your reasons for that, Andy.
>I sent you email, but it was bounced. Would you please post a msg with
>more details?
>
>    -Olin

My reasoning: the most commonly used string ops are moves. There's
only one way to make moves faster - move more data per cycle => larger
busses [*]. Most string moves are small, so would, e.g., be able to
fit into a 128-bit (16-byte) wide bus.

Given that you can move wide words, how do you handle misaligned
moves? By a decomposition of the word into power-of-two sized
transactions - works, but gets more difficult as word size increases,
and, in the dynamic case, requires decisions. Most byte-addressable
architectures already have signals similar to "Store the data off the
bus in this word only". Provide explicit control of these.
Similar arguments apply for length strings.

I don't have enough real data behind these statements, yet; but I've
been flogging them for a few years, and have finally got support to
examine them in detail. Since I need to invent the tools to do the
study first, you could still probably beat me to publication - I
wouldn't mind, just tell me so that I can do something more
interesting.
Now, I admit that John Mashey's statements about the inefficacy of
optimizing strings tend to imply that this is not a very rewarding
area of research, but I point out that the same things also apply to
block moves - and, running profiling on my system I regularly see
block moves occupying up to 10% of system time. Why is a discussion
for an OS group.

[*] Well, remapping and parallel moves are possibilities, appropriate
for large moves, but probably not for the most frequent case.
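
To make the contrast between power-of-two decomposition and explicit
byte masks a bit more concrete, here is a rough software model in C.
It is only a sketch under stated assumptions, not a description of any
real bus or of the design the post has in mind: the 16-byte bus width
is taken from the example above, and the names BUS_BYTES,
byte_enable_mask, masked_transactions and pow2_transactions are all
hypothetical, invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_BYTES 16u /* assumed 128-bit bus, as in the example above */

/* Hypothetical byte-enable mask for one bus word: bit i set means
 * "store byte i of this word". Covers bytes [offset, offset + len),
 * clipped at the end of the word. */
static uint16_t byte_enable_mask(unsigned offset, size_t len)
{
    assert(offset < BUS_BYTES);
    size_t room = BUS_BYTES - offset;
    size_t n = len < room ? len : room;
    /* n is in 0..16, so build the run of ones in 32 bits to avoid
     * shifting a 16-bit value by its full width. */
    uint32_t ones = ((uint32_t)1 << n) - 1u;
    return (uint16_t)(ones << offset);
}

/* Count wide-bus transactions if each bus word is written once,
 * using a byte mask to pick out the bytes that belong to the move. */
static unsigned masked_transactions(uintptr_t addr, size_t len)
{
    unsigned count = 0;
    while (len > 0) {
        unsigned offset = (unsigned)(addr % BUS_BYTES);
        size_t room = BUS_BYTES - offset;
        size_t n = len < room ? len : room;
        addr += n;
        len -= n;
        count++;
    }
    return count;
}

/* Count transactions if the move is split into naturally aligned
 * power-of-two pieces (1, 2, 4, 8 or 16 bytes). Greedy: at each step
 * take the largest piece that is aligned at addr and fits in len.
 * The inner loop is the "decision" made per piece. */
static unsigned pow2_transactions(uintptr_t addr, size_t len)
{
    unsigned count = 0;
    while (len > 0) {
        size_t size = BUS_BYTES;
        while (size > len || addr % size != 0)
            size >>= 1;
        addr += size;
        len -= size;
        count++;
    }
    return count;
}
```

In this model, a 10-byte store starting at address 13 touches two bus
words with the masked scheme (masks 0xE000 and 0x007F), while the
power-of-two split needs five pieces of 1, 2, 4, 2 and 1 bytes. How
much that difference matters on real hardware is exactly the sort of
thing that would need measuring.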