Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!utcsri!utegc!utai!anton
From: anton@utai.UUCP
Newsgroups: comp.sys.ibm.pc
Subject: Re: 386 in IBM AT?
Message-ID: <4022@utai.UUCP>
Date: Thu, 6-Aug-87 13:49:34 EDT
Article-I.D.: utai.4022
Posted: Thu Aug 6 13:49:34 1987
Date-Received: Sat, 8-Aug-87 07:30:24 EDT
References: <1272@killer.UUCP> <786@unccvax.UUCP>
Reply-To: anton@ai.UUCP (Anton Geshelin)
Organization: CSRI, University of Toronto
Lines: 92
Keywords: 386, 20mMHz
Summary:

In article <786@unccvax.UUCP> cbenda@unccvax.UUCP (carl m benda) writes:
>In article <1272@killer.UUCP>, robertl@killer.UUCP (Robert Lord) writes:
>> Intel is coming out with a 20MHz (whew!) 80386 chip later this year, and
>> I would like to drop it into my AT to make it into a fast desktop. If I
>> did this, would I have any problems with existing software/hardware?
>> Would I be able to use software designed for the 386? Would all the
>
>
>No No and Finally No. The problem with what you described is the fact that
>your AT was probably designed to run at NO more than 10MHz. This is because
>the memory is probably no faster than 120ns RAM. What this means is that if
>you replace your 80286 with a 80386 inserted into a Chetah Adapter, it will
>perform at a MIPS rate which is no better than your current configuration.
>

Since when did MIPS rate directly depend on the memory speed? This is not
true for highly pipelined architectures like the 386. Besides, MIPS is not
a good test of system performance.

>What you will gain is the addressing capability of the 80386, namely, 4G
>bytes, and 1Meg segments, instead of 80286 which is 16Meg and 64K segments.

!!!! The 386 supports 4G segments. Are you one of those other uP lovers
trying to proliferate myths about the 386?

>With this in mind, a 20MHz 80386 chip would be wasted in your machine.
>

The problem with today's fast CPUs is that there is no cheap RAM to keep
up with them.
Hence, a cache is necessary so that memory will not degrade CPU
performance. Since an average instruction on the 386 takes longer than
1 bus cycle (= 2 CPU cycles), it is possible to fetch the next instruction
while the first is being executed, using the pipelines in the 386. Hence,
a 20MHz CPU will not be a waste: it will still execute complex code faster,
even while waiting for the memory a lot.

The biggest problem with putting a 386 without a cache into an AT board is
that the AT board has a 16 bit data bus. Therefore, each 386 32 bit fetch
will take 2 bus cycles. A smaller problem is that most AT boards have slow
(120nsec) memories and slow address decode logic.

The biggest advantage of the 80x86 processors is their fast memory cycle.
Each memory fetch requires only 2 CPU cycles vs. 3 or 4 for other uP's.
To ease the memory problems above, the pipelining in the CPU provides the
address of the next fetch about 1/2 cycle before the next bus cycle. This
allows for address decoding while the information from the previous fetch
is on the bus.

>What you would really like to do is keep everything in your AT but your
>mother board. Replace the mother board with an 80386 motherboard which
>will allow you to have a 20MHz machine with 32 bit word memory moves, your
>AT can only do 16bit memory moves. Here again memory for this machine will
>be expensive, I believe with a 20MHz clock rate the memory needs to be
>at least as fast as 70ns. This is a guess since my ALR 386 uses 80ns ram
>and has a 16MHz clock rate.
>

The solution suggested here is definitely the best in terms of performance.
But some caution is advised. Consider a 20MHz CPU. From above (2 CPU cycles
per fetch plus the 1/2 cycle of early address) we get the maximum time to
decode an address and fetch a word on the bus as
2.5 cycles * 50nsec clock = 125nsec without wait states. This is not
enough, even with the fastest decode logic and 80ns RAM.
In fact this is about half the time that is actually needed. (Have you
ever wondered why there are no 0 wait state AT clones faster than 10MHz?)

The cheapest solution is to let the 386 run with lots of wait states (2?).

The next cheapest is to use special RAM. We will call this the Compaq
solution. Here you can address a whole bunch of words *in sequence* very
quickly (i.e. no address decode necessary). With pipelining in the 386 this
probably works out to an average of 1 wait state per bus cycle. (What is
the biggest chunk of inline code and data that you have written?) This is
where Motorola's idea of having separate Instruction and Data caches pays
off.

The best solution is to have a large (64K) fast (35ns) cache. The
difference between 100ns and 80ns main memory RAMs becomes insignificant.
The memory cycle as seen by the CPU becomes (assuming 80% hit ratio)
80%*35ns + 20%*100ns = 48ns, which is barely acceptable at 20MHz. Because
of the 386's pipelining the improvement over the Compaq solution is about
20% (as suggested by Intel literature and determined by tests of cache
based 386 machines vs. Compaq in recent issues of Byte). Caches are
expensive, but so is nibble mode RAM for Compaq.

I hope that you realize what the moral of the story is. When you buy a 386
or any other fast CPU machine it is the memory subsystem that you are
paying for.

Consider a cached 386 (as above) in a 10MHz 0 wait state AT bus. The AT bus
cycle is 2.5 cycles * 100ns/cycle = 250ns. A 32 bit cycle on the AT bus
will be 500ns, i.e. 2 fetches. From the formula above the 386 cycle will be
80%*35ns + 20%*500ns = 128ns, or almost good enough to run with 1 wait
state.

>hope this helps
>/Carl
>...decvax!mcnc!unccvax!cbenda
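
For anyone who wants to check the arithmetic in this post, here is a
minimal sketch in Python. The function and variable names are made up for
illustration and do not come from any Intel document. The 2.5-cycle window,
the 35ns cache, the 80% hit ratio and the 100ns and 500ns miss times are
simply the figures assumed in the text above; real hit ratios and timings
will vary from machine to machine.

```python
# Rough check of the timing figures quoted in the post.
# All names here are hypothetical, chosen for illustration only.

def bus_window_ns(clock_mhz, cycles=2.5):
    """Time to decode an address and fetch a word with no wait states.

    Assumes the post's figure of 2.5 CPU cycles per bus access.
    """
    clock_ns = 1000.0 / clock_mhz  # length of one CPU clock in ns
    return cycles * clock_ns


def effective_cycle_ns(hit_pct, cache_ns, miss_ns):
    """Average memory cycle seen by the CPU for a given cache hit ratio.

    hit_pct is a whole-number percentage (e.g. 80 for 80%).
    """
    return (hit_pct * cache_ns + (100 - hit_pct) * miss_ns) / 100


window_20mhz = bus_window_ns(20)    # 20MHz 386, no wait states
at_bus_cycle = bus_window_ns(10)    # 10MHz 0 wait state AT bus
at_32bit_cycle = 2 * at_bus_cycle   # 32 bits over a 16 bit bus = 2 fetches

cached_fast_ram = effective_cycle_ns(80, 35, 100)
cached_on_at_bus = effective_cycle_ns(80, 35, at_32bit_cycle)

print("20MHz window       :", window_20mhz, "ns")
print("AT bus cycle       :", at_bus_cycle, "ns")
print("AT 32 bit cycle    :", at_32bit_cycle, "ns")
print("cache + 100ns RAM  :", cached_fast_ram, "ns")
print("cache + AT bus     :", cached_on_at_bus, "ns")
```

Changing the hit percentage passed to effective_cycle_ns shows how quickly
the average cycle grows when the cache misses more often, which is the
point about paying for the memory subsystem.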