Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!apple!mips!winchester!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: R4000 "announcement" (64-bit stuff)
Message-ID: <45789@mips.mips.COM>
Date: 11 Feb 91 20:56:17 GMT
References: <90@shasta.Stanford.EDU> <1991Feb8.055009.9883@ico.isc.com>
Sender: news@mips.COM
Reply-To: mash@mips.COM (John Mashey)
Organization: MIPS Computer Systems, Inc.
Lines: 226

In article <1991Feb8.055009.9883@ico.isc.com> rcd@ico.isc.com (Dick Dunn) writes:
....
>With neither a price nor a delivery date, it's hard to see how MIPS has
>announced anything to compete with existing products.  The only "FUD" I see
>right now should be in the hearts of damnfool programmers who have given us
>so much code assuming int==long==32-bits, and who are now seeing
>apocalyptic visions sooner than they thought they might.

Well, this looks as good a place as any to talk about this.
We'll be getting out a good guide for programmers sometime in the next
couple of months, I hope.

It may be worth recounting some history here, as one seeks not to repeat
problems again.  (Every mistake in the computer industry gets made at least
3 times: once by mainframe folks, once by minicomputer folks, and at least
once by microprocessor folks.  Sometimes supercomputer folks have probably
made the mistakes before also!  Also, it's always easy for anybody (like me)
to say "mistakes" with hindsight.)

Here are a few relevant quotes from "Computer Engineering: A DEC View of
Hardware Systems Design", Digital Press, 1978, Bell, Mudge, McNamara.
I don't know if this book is still available, but it's a really excellent
one, with a lot of good history, and also excellent insights into
technology trend curves that are still relevant today.  I especially like
chapters 1, 2, 9-17.

From chapter 16 "The Evolution of the PDP-11", Bell & Mudge:

	"The biggest (and most common) mistake that can be made in a
	computer design is that of not providing enough address bits for
	memory addressing and management. ... For the PDP-11, the limited
	address problem was solved for the short run, but not with enough
	finesse to support a large family of minicomputers.  That was
	indeed a costly oversight. ... it was realized that for some large
	applications there would soon be a bad mismatch between the
	64-Kbyte name space and 4-Mbyte memory space.  Thus, in 1974
	architectural work began on extending the virtual address space
	of the PDP-11...  This segmented address space ... was ill-suited
	to FORTRAN and most other languages, which expect a linear address
	space...  Fortunately, the project was discontinued."

and, from chapter 17, "VAX-11/780: A Virtual Address Extension to the DEC
PDP-11 Family", Strecker:

	"For many purposes, the 65-Kbyte virtual address space typically
	provided on minicomputers (such as the PDP-11) has not been and
	probably will not continue to be a severe limitation.  However,
	there are some applications whose programming is impractical in a
	65-Kbyte virtual address space, and perhaps more importantly,
	others whose programming is appreciably simplified by having a
	large virtual address space."

At one point in time, C and UNIX really KNEW that an int, and a pointer
were 16-bits long.  Then, C got "long", to at least do 32-bit calculations,
and "int" became whatever was convenient.  Inside Bell Labs, before UNIX got
ported to other machines, C at least was available on various 32-bit
architectures (like S/360).  As UNIX got ported to other 32-bit machines,
all of us bad people who'd figured int & char * were the same thing
suffered, but everybody learned pretty quickly not to assume this, and how
to write code that would work both on the 32-bitters (for future) and for
16-bitters (for installed base).

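To make the "don't assume int and char * are the same thing" point
concrete, here is a minimal sketch.  It is an illustration added here, not
code from any of the ports mentioned above.  It assumes a C99-or-later
compiler that provides uintptr_t (an optional type in the standard, though
widely available), and the helper names span_bytes, roundtrip_ptr and
int_holds_pointer are hypothetical.

```c
#include <stddef.h>   /* ptrdiff_t */
#include <stdint.h>   /* uintptr_t (C99; optional in the standard) */

/* Hypothetical helper: byte distance between two pointers into the same
   buffer.  The result type is ptrdiff_t rather than int, so it is not
   truncated on a machine where pointers are wider than int. */
ptrdiff_t span_bytes(const char *start, const char *end)
{
    return end - start;
}

/* Hypothetical helper: round-trip a pointer through an integer.
   uintptr_t is defined to be wide enough for this; plain int may not be. */
char *roundtrip_ptr(char *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (char *)bits;
}

/* Returns 1 if, on this compiler, an int is at least as wide as a char *.
   Portable code should not depend on the answer either way. */
int int_holds_pointer(void)
{
    return sizeof(int) >= sizeof(char *);
}
```

The design choice is simply to name the type by its job (pointer
difference, pointer-sized integer) instead of reaching for int.  Code
written this way should not care whether int_holds_pointer() returns 1
or 0.
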
Also, besides the bulk of the existing programs, which were, by definition,
runnable in the PDP-11's address space, additional programs appeared where:
	a) You'd been chafing at the 16-bit addressing limit, because
	   array sizes or something wanted to be bigger.
	b) You'd been chafing, because you'd had to restructure your
	   application to break it up.
	c) You hadn't even considered running it on a PDP-11, and you'd
	   been running it on bigger systems, but you really wanted it on
	   UNIX, and so now you moved it over.

Of course, in the same time period, code also moved from PDP-11 & VAX to
3B's; people learned to parameterize for byte-ordering then, or, outside
BTL, when all of those early VAX -> 68K ports happened.

By now, 3rd-party software is quite well-parameterized, although I have
some lingering fear that people have once again been assuming char * and
int are the same, as an awful lot of code is only on 32-bit machines.

Now, looking at history, let's note a few things:

There was probably more pain than there needed to be in the transition from
PDP-11 to 32-bit systems.  Still, it wasn't TOO bad, for the reasons below:
	a) By the time it happened, most of the needed code was already
	   written in a high-level language.  This would have been much
	   harder if the bulk of code had been assembly language.
	b) For a long time, many applications continued to be perfectly
	   well runnable in the older environment, and in fact, many
	   applications were able to use exactly the same source code in
	   the old and new environments.
On the other hand:
	c) People might have been a little happier if it had been trivial
	   to run PDP-11 UNIX binaries at full-speed on their VAXen, while
	   recompiling only those programs that needed it.  Maybe this
	   could have been done, or not.
	d) People would have been happier if the compiler approaches had
	   been more common amongst PDP-11 and VAXen.
	   Certainly, they were closer than some (for example: the
	   286 -> 386 transition), but there were enough differences that
	   some strategies might want to be changed, not just low-level
	   code.

Now, WHY 64-bit? and why now?
(The following is reasoning that we used, with a WHOLE LOT of input from
some of our friends, some of whom said we'd be nuts not to go 64-bit, if we
could.  It probably cost us 5-10% of the die space, and some time.)

1) DRAM gets 4X bigger every 3 years.  There's every reason to expect this
   to continue through, at least, the 16Mbit and 64Mbit generations.
   People argue about 256Mb; I don't know enough to argue.

2) If you draw the curves, just of high-end Sun & MIPS servers, of physical
   memory offered, per year (horizontal axis), with a log-scale on vertical
   axis of memory size, you find:
	a) It's a straight line, not surprising, as it just follows DRAM.
	b) It crunches into the 4GB range around 1993.
   Maybe MIPS and Sun are nuts, but if so, they have at least the following
   company: HP, SGI, IBM, none of whose numbers have as reliably over the
   years, but they offer big machines, also.

3) Now, consider two rules of thumb:
	a) 4X: some real, non-lunatic-fringe programs will use 4X more
	   virtual memory than they have physical memory (if the software
	   allows for this sanely).  In particular, file-mapping techniques
	   burn virtual address space much faster than they use physical
	   memory.  Hennessy claims I'm being too conservative with 4X,
	   that it's bigger, but I'm conservative.
	b) .5X: few people buy a maxed-out memory system, because they have
	   nowhere to go.  LOTS of people buy .25X or .5X memory systems,
	   because memory is an effective way to solve many problems, and
	   you tend to get 4X more of it every 3 years, at about the same
	   price.
   These two rules of thumb give you a graph, with a band, that intersects
   4GB around 1991 (leading edge) or 1994 (trailing edge, where LOTS of
   people have .5X max memory systems).

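The arithmetic behind that kind of band can be sketched in a few lines of
C.  This is an illustration only: the post gives no starting memory size,
so the 256 MB figure in the note below is a made-up starting point, and the
function names years_until and vm_demand are hypothetical.  The sketch
treats "4X every 3 years" as a doubling every 1.5 years and does not try to
reproduce the 1991/1994 dates.

```c
#include <stdint.h>

#define MIB (1ULL << 20)
#define GIB (1ULL << 30)

/* Years until a size that doubles every 1.5 years (4X every 3 years)
   first reaches target_bytes.  Returns -1.0 for a zero starting size.
   Resolution is one doubling step; targets above 2^63 are not handled. */
double years_until(uint64_t start_bytes, uint64_t target_bytes)
{
    double years = 0.0;
    uint64_t size = start_bytes;

    if (start_bytes == 0)
        return -1.0;
    while (size < target_bytes) {
        size *= 2;
        years += 1.5;
    }
    return years;
}

/* Virtual-memory demand under the two rules of thumb:
   vm_factor         = virtual/physical ratio (4 in the text above)
   fill_num/fill_den = fraction of maximum memory actually installed
                       (1/1, 1/2 or 1/4 in the text above) */
uint64_t vm_demand(uint64_t max_phys_bytes, unsigned vm_factor,
                   unsigned fill_num, unsigned fill_den)
{
    return max_phys_bytes * vm_factor * fill_num / fill_den;
}
```

With the hypothetical 256 MB maximum, years_until(256 * MIB, 4 * GIB) is
6.0 for physical memory itself, while the virtual demand of a full machine,
vm_demand(256 * MIB, 4, 1, 1), gets to 4 GB after 3.0 years and that of a
half-populated one, vm_demand(256 * MIB, 4, 1, 2), after 4.5.
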
The conclusion from this data is that:
	a) Leading edge users of micros are already starting to run into
	   the limits.  (They are, by the way).
	b) By 1994, the issue will be fairly widespread (I'll say below
	   what I mean by that, and to whom the issue is a problem.)

AND THAT MEANS:
	1991: better have chips that do it, and appropriate advice to
	software developers, because large ISV software takes a while to
	move (just as it took a while to clean up the PDP-11 16-bit stuff,
	especially in the numerous applications floating around BTL: the
	OS was only a small chunk of the effort).
	NEED chips, so that:

	1992: (no later than): better have systems getting out, so people
	can be debugging OS and finishing tests on compilers, debuggers,
	etc.  Applications developers who care can either be working
	cleanups into their code, or developing new ones to take advantage
	of bigger addressing.

	1993: One hopes that systems, with 64-bit compilers, tools, etc,
	and at least some OS support, are getting into application
	developers' hands in reasonable numbers.  Maybe some are even
	getting to users.

	1994: better have serious applications in users' hands.

Now, some people claim that there's no way in the world you can do it that
FAST.  I claim that it's possible, although only just barely, i.e., all of
this is only just in time, if you believe my data on DRAM trends.

Now, this DOESN'T MEAN:
	a) That every PC-user is doomed if they don't rush out and get a
	   64-bit processor :-)
	b) That most applications won't happily stay 32-bit forever.

However, certain applications want to use more than 32-bit addressing,
regardless of the physical memory on the machine.  There are almost always
hacks to extend the physical addressing by a few more bits, and some of us
have endured them.  They mostly cause pain for OS folks, not applications
people.  Likewise, segmented addressing (although we don't believe in it)
can be a reasonable solution for some problems.
The main issue is the ease of programming, and people vary in their
opinions.

Following are application areas likely to care about this issue.

1) DBMS: people use file-mapping more.  Older, big DBMS already wanted more
   addressing space: consider 370/ESA for example, which has been around
   for years, and has a less-than-pleasant-to-program mechanism to get more
   bits.  Consider that a big 1990 SCSI disk is 1GB (30 bits), and that
   people sell small deskside packages with 7GB already.  Consider that in
   just a few years, addressing every byte on 1 SCSI disk will overflow
   32-bits...  Obviously, there are ALWAYS ways to get around this with
   disk access; people have been addressing more for years; however, the
   further away you get, the worse it gets with hackeries.

2) Video: uncompressed, 1280x1024 screen, 24-bit color = 3.75MB per frame;
   at 24 frames/second, that is 90 MB/second.
   4GB = 45 seconds of video of this kind.

3) Document photographs & other high-quality images:
   8 1/2 x 11" page, 24-bit color, 300dpi = 25 MB.
   4GB = 160 pages.  (likewise, uncompressed)

4) CAD environments: typically have big databases of complex-structured
   objects.  Big-physical-memory servers keep the databases, and run
   monster simulations that consume both virtual and physical memory.
   WORKSTATIONS tend not to have such big memories, but want to rummage
   around in the databases for random slices of them, hopefully using the
   same software for sanity.  On the simulation side, I confidently predict
   that there will be some ECAD folks here who'll want more than 4GB
   virtual space for some chip-verification thing, within the next few
   years.  ECAD gobbles space; I suspect MCAD is at least as bad.

5) Geographic Information Systems.
   Like 4).

AND, OF COURSE:
6) Technical number-crunchers: have NEVER fit in ANY space :-)

Anyway, the bottom-line conclusions were that there were reasons to get
going on the support of this transition, because there was a small, but
rather important fraction of applications that cared about this; if we
didn't start now, we'd only have to redesign the fundamental integer unit
in the very next spin of the chip; and that it takes a while for the
software world to evolve.  We do think that 64-bit desktops are necessary,
not just to get software development to happen, but because certain kinds
of end users will need it.

NOBODY expects this means that one's word processor or paint program is
suddenly obsolete.  On the other hand, the kinds of applications that
strain 32-bits are not little ones.  If a simple programming model is
available, it may make the difference between an application being
possible, and not.

Anyway, that's the reasoning, right or wrong.

An interesting and useful technical debate might occur regarding 64-bit
flat address generation versus the various segmentation schemes that are
currently found, both on technical merits, and also from the software
writers' viewpoints.
-- 
-john mashey	DISCLAIMER:
UUCP: 	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086