Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!apple!vsi1!wyse!mips!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: Criteria ... [really: are N designs better than 1?]
Message-ID: <19088@winchester.mips.COM>
Date: 9 May 89 09:14:51 GMT
References: <2368@ogccse.ogc.edu> <1464@cfa.cfa.harvard.EDU> <141@dg.dg.com> <156@dg.dg.com> <658@pitstop.West.Sun.COM>
Reply-To: mash@mips.COM (John Mashey)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 332

In article <658@pitstop.West.Sun.COM> (Adrian Cockcroft) writes:

>In the real world the main thing to optimise is price/performance so we
>should be looking at what MIPS/MFLOPS can be achieved within a given
>development timescale, development budget and unit cost. Thats fine for
Yes.
>embedded controllers but for Unix boxes there is also the issue of applications
Yes.
>software. SPARC has over 500 applications, increasing at a very high rate.
>What do the other vendors claim? Sun claims that SPARC has more
>applications software than all other RISC systems put together in a recent
>SPARCware glossy.

Counting applications is even harder than counting mips-numbers. I believe Sun has a very good 3rd-party catalog, but counting total numbers is very akin to counting average-mips numbers. I'm tracking down some current data on this and will post something next week. What I do know is that Sun is ahead in some application areas, but not others...

>Given the same development resources and the same types of implementation
>that might be true. Right now there are more design teams working on more
>different future SPARC implementations because it is licensed to TI, Fujitsu,
>Cypress/Ross Technology, LSI, Prisma, Solbourne etc. The 88K only has
>Motorola and MIPS do all the chip designs themselves. I think that the
>competition between SPARC vendors to have the best performance will
>encourage more innovative developments. Right now Cypress has 40 MHz SPARC
>available, MIPS are at 25 MHz? 88K at 20 MHz?

I thought this was a dead topic, but since it has now been raised again, I'm forced to point out some FACTS (note that the above discussion is a mixture of FACT (SPARC is licensed...), OPINION ("I think"), and FACTOID ("Right now Cypress...") (well, I will offer a few OPINIONs, too :-)

1) (OPINION): N overlapping chip designs for the same architecture are not necessarily better than one really good design, for the same reason that having N software teams doing the same job may (or may NOT) be as effective as having one really-experienced team doing a project. I observe that there are relatively few world-class, leading-edge, proven, high-performance VLSI microprocessor design teams around, who can do good architecture AND implementation. Anybody who is trying to play in this game, with their own design, but without such a team is probably wasting their time.

(FACT) Anyway, there are 2 opposing OPINIONS:
	a) (SUN OPINION) it's better to have lots of designs, and let them compete.
	b) (MIPS OPINION) it's better to do one design, get multiple pin-compatible sources for it, plus (maybe) variants of packaging and other attributes as needed.

Right now, Adrian's OPINION is just that. I'll suggest below some data points that might help evaluate this OPINION, plus some milestones for the future.

2) Mr. Cockcroft apparently believes the Cypress numbers mean something. 33MHz parts were supposed to be sampling almost a year ago, and certainly were supposed to be in production 3Q/4Q 88. Why do I think this is meaningless? ANS: because you must build a SYSTEM out of this stuff, not just an integer unit.
For a UNIX system, in particular, you must buy/build a SYSTEM that runs at the given speed, i.e.:
	CPU (Integer Unit = IU)
	MMU
	cache control
	cache
	FPU (and FP controller, if you need a separate one)
	external memory interface
	any other glue needed
Trumpeting the speed of the IU alone is silly, if you can't get EVERYTHING to run at that speed. Some controller applications might omit the FPU, and have a simplified MMU, of course.

We only have one data point: 33MHz Cypress IUs were sampled almost a year ago; nevertheless, I know of NO 33MHz Cypress-based SYSTEMS on the market, delivered in production form to end users. Given that SPARCstation/system 300s (25MHz) have 60-90-day AROs, that means that the approximate date of (25MHz) shipment is something like June/July 1989. (Note these are the small 300s; the big ones are 120-150 days ARO, I think). I'll be glad to hear if anybody is shipping 33MHz Cypress-based systems to update this.

Note: Sun:
	1) is a fine systems organization, experienced in turning out designs that push technology, and doing it fairly quickly
	2) participated heavily in the design/implementation of this chip, so nothing about it should surprise them.
There was STILL a year of elapsed time between IU samples and shipped product, even for a quick, knowledgeable organization. I suspect this means that it will take most others a while longer to get there.

I often use a car analogy, with the CPU as engine. It doesn't do you any good to have a high-RPM engine if you can't build the rest of the car to match.

FACT: R3000s first sampled about a year ago; at least 2 companies (MIPS & SGI) have shipped 25MHz systems, as early as 4Q88. I suspect you may see a few others soon, in the 20-25MHz range.

3) [FACT] To get back to observable reality, let's look at dates, clock rates, and relative performance of chips in SYSTEMS, using dates of production shipment (not beta, or selected ISVs, but when you, as a random customer, get one after you call up and order it.)
This reduces the difficulty of comparing things that are shipping with those that are "announced", but you can't get. (Note that in these days of big performance jumps each year, something that is spectacular at one point is ho-hum if delivered 6-12 months later. (EEK! life is harder than it used to be!) Unless you're right in the middle of things, it's hard to tell what's real, and what's just "announced".) To keep this simple, assume that Dhry-mips are Dhrystone mips, and VAX-mips are VAX-relative mips the way MIPS measure them. In both cases, for simplicity here, these are just single-program integer performance ratios, ignoring both floating-point and multi-tasking ones. People can argue with me on the integer VAX-mips assignments (*), but I claim that the numbers in the MIPS Performance Brief show rough parity amongst the systems that I gave similar numbers to, at least on integer performance. (this is not necessarily the case on floating point or multi-user performance, although the later SPARCs have improved at least some on the former. I have no data on the new ones for the latter. All I've got on the newest SPARCs are Dhrystone (1.1, no gimmicks), and DP/SP FORTRAN LINPACK. Still, one can gain some data from these. There's a new Performance Brief that will be out in another week or so, that has all these numbers integrated, so I won't duplicate that here.) 
OF COURSE, SINGLE-NUMBER MIPS-NUMBERS ARE Wrong Things, but here are the machines generally on the leading edge of the performance curve for uniprocessor RISC chips, ignoring other machines done with lower clock rates in the meantime:

CMOS RISC micro uniprocessors:
Sorted by date, with MIPS using numerics and SPARC using alphas:

Code  Clock  Dhry  VAX    date   machine
      MHz    mips  mips
1     12.5   11     8     2Q87   MIPS M/800, R2000+128K cache
2     15     13    10     4Q87   MIPS M/1000, R2000+128K cache
A     16.7   11     8*    4Q87   Sun-4/200, Fujitsu+128K cache
3     16.7   16    12     2Q88   MIPS M/120-5, R2000+128K cache
4     25     24    20     4Q88   MIPS M/2000, R3000+128K cache
B     25     16    12*    2Q89   SPARCsystem 300s, Cypress+128K cache
C     33     20    16*?   4Q89?  SunRay, Cypress+??

Now, let's graph this, with uniprocessor performance as shown in earliest production ships of the faster machines:

VAX-  ---- 1987 ----| ---- 1988 ----| ---- 1989 ----|
mips  1Q  2Q  3Q  4Q  1Q  2Q  3Q  4Q  1Q  2Q  3Q  4Q
 20                               4
 18
 16                                               C?
 14
 12                       3               B
 10               2
  8       1       A
  6
  4
  2
      ----------------------------------------------

Now, one can argue about jiggles of a few months, or a vax-mips, but some useful conclusions so far:

a) The A-B-C curve didn't double performance every year. Assuming the SunRay gets out 4Q, the doubling will have taken two years.

b) It is hard to find any evidence in this chart that the 3 different implementations (A, B, C) are providing FASTER systems EARLIER than the 2 related implementations (1-2-3, 4). It is impossible to predict the future, however.

c) (OPINION): I don't think the CPU kits for the SPARC implementations cost less than the MIPS ones. This can be seen, for example, in LSI Logic selling (last fall) both SPARC & MIPS CPUs for $10/mips (using the vendor's ratings of mips). R3010 FPUs usually cost 1.5X-2X the corresponding CPU. 12.5MHz R3000s have recently been quoted at $60 in quantity.
If you compare against the new Sun products, the MIPS ones always save on the MMU and cache control, and sometimes save on an FPC (except versus SPARCstation1, which uses the Weitek single-chip solution; the 300's use an FPC+TI8847.) They probably use a few more chips for write-buffers. Anyway, I suspect the cost of the complete kit is usually lower for MIPS, based on counts of the big and/or fast parts in such designs. Both SPARC and MIPS need the same speed vanilla SRAMs at the same clock rate, I think.

Given that C (SunRay) increases the clock rate of B by 1.32, unless something unusual is being done, if the performance of B and 3 is comparable, the performance of C will be around 16-real-vax-mips-on-the-MIPS-scale. (In fact, to actually achieve performance scale-up with clock rate, one MUST do something extra, as DRAM latency (in cycles) is worse; presumably, as a high-end machine, one would expect the SunRay to have ECC (the SPARCsystem300 seems to have parity-only, even when offered as a rack-mount server), and ECC often costs you a cycle. Hence, the SunRay will have to work harder to get the clock-rate scale-up. Possibilities include bigger caches or deeper write-buffers, for example (if it's a write-thru cache).)

Another way to put this: if somebody can ship a 40MHz Cypress SPARC, in a production system, this year, they'll catch up with the 25MHz 4Q88 MIPS product.... This illustrates the next rule of RISCars: there's RPM at the engine, and there's speed on the road....

FACT: here's a fairly straightforward head-to-head comparison: Both Cypress CY7C601 and MIPS R3000 are full-custom CMOS, and Cypress says that its process is the best, that "the competition is 2 to 3 generations behind". (quote from Cypress seminar book.) If this is so, the effective performance in a system STILL hasn't caught up with those chips that are in processes "2 to 3 generations behind".... (OPINION: This generation stuff is nonsense, by the way.)
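
The clock-rate arithmetic above can be sketched in a few lines of Python. This is only an illustration under a loud assumption: that delivered VAX-mips scale linearly with clock rate, which the post itself calls the best case (DRAM latency and ECC work against it). The function names are hypothetical; the clock and VAX-mips figures are the ones from the table of shipped systems.

```python
# (clock in MHz, VAX-mips) for each shipped system, copied from the table.
# C (SunRay) is left out because its numbers are projections, not shipments.
SYSTEMS = {
    "1": (12.5, 8),
    "2": (15.0, 10),
    "A": (16.7, 8),
    "3": (16.7, 12),
    "4": (25.0, 20),
    "B": (25.0, 12),
}


def mips_per_mhz(code):
    """VAX-mips delivered per MHz of clock for one system."""
    clock_mhz, vax_mips = SYSTEMS[code]
    return vax_mips / clock_mhz


def scaled_mips(code, new_clock_mhz):
    """Best-case VAX-mips at a new clock, ASSUMING perfect linear scaling."""
    return mips_per_mhz(code) * new_clock_mhz


def clock_to_match(code, target_mips):
    """Clock (MHz) a design would need to reach target_mips, same assumption."""
    return target_mips / mips_per_mhz(code)


if __name__ == "__main__":
    # B (25MHz Cypress) pushed to 33MHz: about 16 VAX-mips.
    print(round(scaled_mips("B", 33.0), 1))
    # Clock B's design would need to match system 4's 20 VAX-mips.
    print(round(clock_to_match("B", 20), 1))
```

Under that assumption, B at 33MHz comes out near 16 VAX-mips, and matching the 20 VAX-mips of the 25MHz 4Q88 MIPS system takes a clock a little above 40MHz, which is roughly where the "40MHz Cypress SPARC" remark lands.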
FACT: There's another good head-to-head coming up. As has been widely published, Sun is building an ECL system with technology from BIT (Bipolar Integrated Technologies). Although it has NOT been widely published (but has been admitted to), MIPS is also building an ECL system, also working with BIT. As it happens, these two projects actually started about the same time (late 1986), and they use exactly the same technology, so there will be a really fascinating comparison in early 1990.

(BTW: related to some earlier discussions in this newsgroup on the structure of the industry, (OPINION) either or both of these VLSI ECL RISCs are going to cause serious trouble in some parts of the mini-super business, as they should have ridiculously low cost/mips for mainframe-class CPU performance, with no special programming required, and with inexpensive and plentiful low-end systems to have gathered software.)

(OPINION) one must also assume that DG's claim of 100-mips ECL 88K's in 1991 is sandbagging: CMOS/BiCMOS chips will be banging around near that performance level that year also, so ECL should either be earlier or faster....

>For a really nice multiprocessor implementation check out the Cypress CY7C605
>Multiprocessor Cache/MMU (CMU-MP). It is designed by the people who left
>the Motorola 88K design team to set up Ross Technology. Cypress gave a series
>of seminars earlier this year and there is a good description in their RISC
>seminar notebook.

(OPINION) The CY7C605 looks like a nice part. (FACT) Note that it is not shipping yet. Note that it seems different from the Reference MMU specified a while back, which is also different from the ones Sun uses in most SPARC-based products. (OPINION): there would be a lot more design wins for SPARC if this part and its cousin, the CY7C604, had been designed along with the IU. Many wins have been lost over the confusion of different parts and combinations thereof.
Presumably there will be time for Cypress to make some money on these things before the Sun/TI BiCMOS thing obsoletes them....

Finally, since this whole topic was raised by Mr. Cockcroft, there are a few facts that he may not have been aware of, and then I have a few questions that I just can't resist:

FACT: it is well-known that Sun is working very hard with TI on the next-generation BiCMOS superscalar superchip, i.e., the moral equivalent of the Intel i860 or MIPS R(something). Presumably, Cypress will second-source this. I have no idea whether Fujitsu or LSIL get to second-source it or not; Solbourne is of course working on its own.

FACT: the MIPS semiconductor partners were picked, at least partially, to overlap some (to have true multiple sources & competition), but also to have enough different specialties that they can all make money and stay in this game for multiple rounds. Note that our partners don't spend a ton of money designing CPUs from scratch; they spend that money in marketing, support, support chips, or chip variations, yield improvements, etc.....i.e., things that semiconductor companies do especially well.

Now, for some questions (my opinions on the answers appear later):

Q1: what is the first year in which some other vendor will ship more SPARC-based UNIX systems than Sun?

Q2: what is the first year in which all other UNIX vendors put together will ship more SPARC-based systems than Sun?

Q3: what is the first year in which some other vendor will ship more SPARC-based systems than Sun?

Q4: If you're a semiconductor vendor, and you do a SPARC design from scratch (not second source), you spend a lot of money, especially if you're serious about architecture. What happens if you aren't one of the close-coupled partners for the "next round"? Do you do one on your own to stay competitive, or do you drop out?
(OPINION: looks like LSIL is the big winner in the current round, if they are indeed supplying all of the IUs for the SPARCstation1, which is of course, the high-volume product. Perhaps some of the units are from Fujitsu, which otherwise appears not to be supplying ANY of the parts on this round.... maybe next round, or maybe they've got plenty of alternate customers, as Sun-4/[12]xxx go away.)

OPINIONS:

A1: never. (since, after all, this means that somebody appears from nowhere and beats out Sun at building SPARC-based workstations; Solbourne is clearly the front-runner in this race so far....)

A2: probably never. less clear; it certainly won't happen this year. Sun claims it will ship 30K SPARCstations in 1989, 100K+ in 1990, and 240K in 1991. Unless a few more vendors appear quickly, it is hard to believe that any bunch of them will beat this. If you are doing a SPARC workstation design, you'd better be doing something at a different design point than the SPARCstation1, because it's a nice design, and you're unlikely to beat Sun on volume (and therefore, probably on cost).

A3: maybe sometime, since presumably Xerox will get them into printers or copiers. Presumably there are some more embedded applications lurking around out there, although we haven't seen them much in that domain, although enough of it is indirect that we might not anyway. (Perhaps the 29K or i960 folks might care to comment; or the Sun folks to mention embedded designs that are public.) We REALLY won't know until products are actually announced and shipped (as some of the companies listed as SPARC-committed are not as committed as all that.... so it is very hard to know what to believe of "design-win" counts.)

A4: I wouldn't be surprised to see one (I don't know which) of the partners drop out by next year, at least on doing independent new designs.

SUMMARY:

1) Objectively, Sun has a fine 3rd-party catalog.
One does need to do more than count total applications, and I'll see if I can dig out some relevant info in the next week or so. (that's a whole separate discussion, and this is already long enough.)

2) The OPINION that it is better to have a whole bunch of overlapping designs going on is an OPINION. So far, I haven't seen much evidence to support that opinion. Maybe there will be in the future. We'll see how the ECL war comes out; we'll see when CMOS SPARCs catch R3000s in performance in a delivered system; we'll see when we get the comparison of the super* chips in 1990.

3) Note that this whole discussion started when somebody asked for some objective criteria for comparing RISCs....objective criteria, not marketing OPINION.
--
-john mashey	DISCLAIMER:
UUCP:	{ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com
DDD:	408-991-0253 or 408-720-1700, x253
USPS:	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086