Xref: utzoo comp.arch:9655 comp.sys.intel:803
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!decwrl!nsc!unixprt!paf
From: paf@unixprt.UUCP (Paul Fronberg)
Newsgroups: comp.arch,comp.sys.intel
Subject: Need information about 386 performance
Keywords: i386, pipelining, clock cycles, performance
Message-ID: <390@unixprt.UUCP>
Date: 10 May 89 16:34:46 GMT
Followup-To: poster
Organization: uni-xperts, Inc. - Unix System and Networking Consultants
Lines: 22

I am analyzing the i386 for use in a controller and am having problems
reconciling calculated timings with measured timings. Does anyone
know of any documentation, ap-notes, or the like that might be available
from Intel that describes the inner workings of the i386, especially how
the various phases of the pipeline interact?

The measured and calculated times are very different for several test code
fragments (the measurements were made on a 386 Unix box with 0 wait state
memory and a 32 bit bus). I suspect I am seeing collisions between instruction
prefetch and instruction memory accesses. Things seem to be very complicated
when the MMU is activated and the memory is not 0 wait state, and I am not
sure that the pipeline timing diagrams I am generating are anywhere near correct.

The hardware and software reference manuals are very skimpy when it comes
to information of this type. The timing information in the Programmer's
Reference Manual gives only execution cycles, assuming that the instruction
has already been prefetched and decoded. There seems to be no information
concerning the effects of the MMU, addressing, etc.

Any help or information would be most appreciated. 

Paul Fronberg