Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!shadooby!accuvax.nwu.edu!tank!shamash!nic.MR.NET!srcsip!csd4.milw.wisc.edu!lll-winken!uunet!portal!cup.portal.com!mslater
From: mslater@cup.portal.com (Michael Z Slater)
Newsgroups: comp.arch
Subject: 486 and 68040
Message-ID: <17131@cup.portal.com>
Date: 14 Apr 89 03:53:35 GMT
Organization: The Portal System (TM)
Lines: 92

> query about comments on 68040 and 80486

It is curious that the net has been so quiet about these new
processors. Here's my perspective:

The 040 and the 486 are very similar in approach. Both chips use
on-chip caches of 8 Kbytes, include snooping for cache coherency, have
on-chip floating-point coprocessors, use pipelining, bypass gates and
other tricks to reduce the average clocks/instruction, and are supposed
to be fully compatible with their predecessors. Both are claimed to be
2.5 to 3 times as fast as the previous versions (030 or 386) at the
same clock rate.

As to which is faster, there's just not enough public information to
call this one. Intel has not yet released any real performance data.
They have quoted 37,000 Dhrystones and 6.1 MWhetstones at 25 MHz, and
15 to 20 VAX MIPS. These figures, however, are from simulations. I'll
remain skeptical until we see some measured data.

As for the 040, Motorola has not formally introduced the part; they
have made only an "architectural announcement", and have withheld all
details. (They don't even acknowledge the cache sizes officially.)
Until the formal intro this fall, any real evaluation will be
impossible.

The differences between the two chips are in several categories:

- The inherent differences in the efficiency of the instruction sets,
  which are essentially the same as their predecessors. This is
  largely a religious argument, which is probably pointless to pursue.

- The degree to which clocks per instruction has been reduced. Intel's
  486 provides single-clock loads, stores, and moves.
  Assuming a cache hit, data can be used by the instruction immediately
  following the load, with no stall cycle at all. It remains to be seen
  if the 040 will do this.

- Cache architecture. Intel uses an 8-Kbyte unified cache, which allows
  them to support self-modifying code, which is quite common in MS-DOS,
  Windows, and OS/2 software. (No groans, please - despite the
  desirability or lack thereof of this programming technique, it only
  makes sense for Intel to support all the existing code.) Motorola,
  on the other hand, uses separate 4K caches, which may be less
  efficient due to the fixed partitioning. Intel avoids the bandwidth
  problem of a unified cache by using a 128-bit bus to read 16
  instruction bytes at a time from the cache, and by giving priority
  to data accesses.

- Multiprocessor support. Both processors will provide snooping. There
  are several issues about second-level cache support, etc., which we
  cannot compare until Moto releases full details.

The two chips will not compete head-to-head in many instances.
Obviously, PC clone vendors will use the 486, and Apple will use the
040. Vendors of 030-based Unix workstations are likely to use the 040,
and Sun has said that they will use the 486. (I don't think Sun has
said one way or the other about their plans for the 040.) HP has
committed to using the 040.

Motorola has an edge in the workstation market because there is far
more workstation software for the 68000 architecture than for any
other. However, ISVs are rapidly porting to RISC architectures and to
the 386 architecture. Furthermore, Intel has a very strong edge in
being able to run DOS and OS/2 software very quickly, in a Unix window
if desired.

Incidentally, it was striking how much Intel emphasized the 386 at the
486 announcement. As part of the same event, they announced the 33-MHz
386 and a (slightly) lower-power 386SX.
They drove home the point, over and over again, that the 486 was a
386-architecture device, and that all software written for the 386
will run on the 486.

Application programs written for the 486 will also run on the 386.
There is one new user-mode instruction, which swaps the byte order of
a 32-bit word, but few programs are likely to use this. Operating
system kernels will have to be modified for the 486, to support
additional bits in the page tables for cacheability control, and to do
things like set the control register bit that enables the cache. There
are a couple of new instructions, like compare-and-swap, to support
multitasking and multiprocessor operation; these instructions will
also be used in 486 kernels.

The lesson of both of these processors is that CISC can catch up to
RISC performance; it just takes a while. I think RISC will stay a step
ahead, but CISC is not topping out. To drive this point home, Intel
announced an agreement with Prime Computer to develop an ECL
implementation of the 486 architecture, which Intel will sell as a
module. (Think they said 10" x 3" x 3", and 120 MIPS, in 1992.)

In what seemed like a joke, but wasn't, Intel Pres Andy Grove said
that in the year 2000, they would be able to make a processor with
tens of millions of transistors (I forget exactly how many he said),
and some thousands of MIPS, which will be fully compatible with the
386. (The cynics among us might also point out that such a processor
will be architecturally compatible with the 8008.)

Michael Slater, Microprocessor Report
550 California Ave., Suite 320, Palo Alto, CA 94306
415/494-2677   fax: 415/494-3718
mslater@cup.portal.com
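
For readers unfamiliar with the two new operations mentioned above (the byte-order swap of a 32-bit word, and compare-and-swap), here is a minimal Python sketch of what each one computes. It models only the semantics and is not 486 code: the function names `bswap32` and `compare_and_swap` and the list-as-memory representation are illustrative assumptions, and the Python version of compare-and-swap is an ordinary sequential function, whereas the point of the hardware instruction is that the compare and the store happen as one indivisible step.

```python
def bswap32(word):
    """Reverse the byte order of a 32-bit word (hypothetical helper name)."""
    word &= 0xFFFFFFFF  # keep only the low 32 bits
    # Serialize big-endian, then reinterpret the same bytes as little-endian.
    return int.from_bytes(word.to_bytes(4, "big"), "little")


def compare_and_swap(memory, addr, expected, new):
    """Sequential model of compare-and-swap (assumed interface).

    If memory[addr] equals `expected`, store `new` and report success;
    otherwise leave memory untouched and report failure. Real hardware
    performs this as a single indivisible operation; this model does not.
    """
    if memory[addr] == expected:
        memory[addr] = new
        return True
    return False


if __name__ == "__main__":
    # Byte swap: converts between big-endian and little-endian layouts.
    print(hex(bswap32(0x12345678)))

    # Compare-and-swap used as a simple lock word: 0 = free, 1 = held.
    lock = [0]
    print(compare_and_swap(lock, 0, expected=0, new=1))  # first taker wins
    print(compare_and_swap(lock, 0, expected=0, new=1))  # second attempt fails
```

The lock example suggests why a kernel supporting multitasking or multiprocessing wants such an instruction: only one contender can observe the word as free and claim it.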