Xref: utzoo comp.arch:8746 comp.sys.intel:761
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!decwrl!sun!pitstop!sundc!seismo!uunet!mcvax!inria!ircam!elind
From: elind@ircam.UUCP (Eric Lindemann)
Newsgroups: comp.arch,comp.sys.intel
Subject: Re: i860 overview (long)
Message-ID: <495@ircam.UUCP>
Date: 13 Mar 89 17:42:22 GMT
References: <807@microsoft.UUCP> <92634@sun.uucp> <13322@steinmetz.ge.com> <340@wjh12.harvard.edu>
Reply-To: elind@ircam.UUCP (Eric Lindemann)
Organization: l'Institut de Recherche et Coordination Acoustique-Musique
Lines: 37

Can somebody clarify the following?

In the "i860 overview (very very long)" w-colin writes:

> Everything's single-cycle, but here's what can interlock:
>
> i-cache miss: given in terms of pin timing, plus two cycles if d-cache miss
> in progress simultaneously.
>
> d-cache miss (on load): again, pin timing, but it seems to be "clocks from
> ADS asserted to READY# asserted"
>
> fld miss: d-cache miss plus one clock

I don't think this is exactly what the Intel literature says. I read
it more like this:

The following "freeze conditions" will cause a delay:

* A reference to the destination register of a load instruction that misses

* An fld (load to floating-point register) that misses

In other words, you can fire off a "load" instruction (which must mean
a load to an integer register) and continue executing without delay as long
as you don't reference the destination register of that "load" instruction.
A cache miss should then only delay the availability of the data in the
register, without necessarily interrupting execution.

A cache miss on an FLD instruction however will apparently always cause a
"freeze", or delay in execution, whether or not the FLD destination register
is referenced by a subsequent instruction.

Can this be? Is there some basic difference in interlock behavior between
integer and floating point register files? If so, this can make a big
difference in throughput.
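
To put a rough number on that, here is a toy cycle-count model in Python.
It is only a sketch of my reading of the two freeze conditions, not a
description of the real i860 pipeline. The function name `cycles`, the
register names, and the 5-clock miss penalty are all made up for
illustration; the real penalty depends on pin timing.

```python
def cycles(program, miss_penalty, freeze_on_miss):
    """Count clocks for a list of (dest, sources, misses) instructions.

    freeze_on_miss=False models a load whose miss only delays readers of
    its destination register (my reading of the integer "load" case).
    freeze_on_miss=True models a miss that stalls everything at once
    (my reading of the "fld" case).
    """
    ready = {}  # register name -> first clock at which it can be read
    clock = 0
    for dest, sources, misses in program:
        # Interlock: wait until every source register is available.
        for reg in sources:
            clock = max(clock, ready.get(reg, 0))
        clock += 1  # the instruction itself takes one clock
        if dest is not None:
            ready[dest] = clock
        if misses:
            if freeze_on_miss:
                clock += miss_penalty  # everything waits for the data
            elif dest is not None:
                ready[dest] = clock + miss_penalty  # only readers wait


    return clock


# Hypothetical sequence: one load that misses, three independent
# instructions, then one instruction that reads the loaded register.
PROGRAM = [
    ("r1", [], True),
    ("r2", [], False),
    ("r3", [], False),
    ("r4", [], False),
    ("r5", ["r1"], False),
]

MISS_PENALTY = 5  # assumed value, chosen only for the example

if __name__ == "__main__":
    print("delay only on use:", cycles(PROGRAM, MISS_PENALTY, False))
    print("freeze on miss:   ", cycles(PROGRAM, MISS_PENALTY, True))
```

Under these assumptions the first case takes 7 clocks and the second 10:
the three independent instructions hide part of the miss in one case and
none of it in the other. If the fld behavior really is as described, code
that schedules work behind a float load gets no benefit from doing so.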