Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!aglew
From: aglew@dwarfs.csg.uiuc.edu (Andy Glew)
Newsgroups: comp.arch
Subject: Re: Synchronization primitives and cache coherence
Message-ID: <AGLEW.90Jun13201145@dwarfs.csg.uiuc.edu>
Date: 14 Jun 90 01:11:45 GMT
References: <1360003@aspen.IAG.HP.COM> <AGLEW.90Jun13163815@lasso.csg.uiuc.edu>
	<AGLEW.90Jun13183853@dual.csg.uiuc.edu>
Sender: usenet@ux1.cso.uiuc.edu (News)
Organization: University of Illinois, Computer Systems Group
Lines: 63
In-Reply-To: aglew@dual.csg.uiuc.edu's message of 13 Jun 90 18:38:53

By the way, I am maintaining the paper "A Survey of Synchronization
Primitives" from which the previously posted comparative tables are
extracted.
    "Maintaining" means that I am trying to keep it reasonably up to
date and complete; ideally it would get distributed once every year or
two with the latest updates.

If you know the details of the synchronization primitives of systems
that are not in the tables or the paper, or if I have made any errors,
I'd appreciate learning about them.

[Ideally]
    Send me a technical reference manual for the system CPU and/or cache
    and/or bus and/or memory.  I need all or most of these because the aspects
    that I'm interested in involve interactions between all of the
    components.  (E.g., many 68000-based *systems* do not implement CAS.)
    But any single manual helps.

    (Well, you can't blame me for trying).

[Less ideally]
    Tell me where to send for such a reference.  (If it costs money I'm
    unlikely to be able to afford it, but the address or phone number may
    be useful.  I'm getting good at begging for free info.  If you have a
    copy to lend, I'll send it back to you when I'm done.)

[Least ideally]
    Give me the scoop yourself?  I usually don't quote information
    received by hearsay or email, because I've found it to be less than
    reliable - even when the engineer who designed the part is talking!
    But your description might (1) sensitize me to things I should be
    looking for if I get the real stuff, and (2) tell me about things that
    aren't in the documentation.

I'm particularly interested in:

a) what the atomic instructions actually are
b) what bus transactions are actually produced
c) are the bus transactions split?
d) how is atomicity maintained?
    is the bus exclusively locked throughout?
    or is there a lock maintained at the memory controller?
e) does your atomic operation use the processor cache,
    or just bypass it?
    does it invalidate other caches?
    does it invalidate its own cache?
f) are there conditions where the atomic operation 
    can short circuit without going through a full RMW?
    after reading the cache?
    after going to the bus?
g) how are bus transactions scheduled?
h) once a processor requests the bus for the atomic operation,
    can it abandon its request?

(The last point is important.  If a processor can abandon a pending bus
request, the time for a lock transfer in a test-and-test-and-set spinloop
drops to O(1) from O(n), or from as much as O(n^2) if the bus scheduling
is, e.g., fixed priority, where n is the number of processors.)
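
For readers who haven't met the term, here is a minimal sketch of a
test-and-test-and-set spinloop, written with C11 <stdatomic.h> purely for
illustration.  The names ttas_lock, ttas_try_acquire, etc. are my own
hypothetical ones, not any system's API, and what bus traffic each step
really generates depends on exactly the hardware details asked about above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: false = free, true = held. */
typedef struct {
    atomic_bool held;
} ttas_lock;

static void ttas_init(ttas_lock *l)
{
    atomic_init(&l->held, false);
}

/* One acquisition attempt.  Returns true if we now own the lock. */
static bool ttas_try_acquire(ttas_lock *l)
{
    /* "Test": an ordinary load.  On a coherent cache this is usually
     * satisfied locally while the lock is held, so spinning here
     * need not generate bus traffic. */
    if (atomic_load_explicit(&l->held, memory_order_relaxed))
        return false;

    /* "Test-and-set": the atomic read-modify-write.  This is the step
     * that requests the bus.  If the request cannot be abandoned once
     * another processor has won, every waiter still performs its RMW. */
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

static void ttas_acquire(ttas_lock *l)
{
    while (!ttas_try_acquire(l))
        ;   /* spin */
}

static void ttas_release(ttas_lock *l)
{
    /* Sanity check: only a held lock should be released. */
    assert(atomic_load_explicit(&l->held, memory_order_relaxed));
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

When the holder releases, all n-1 spinners can see the lock go free at about
the same time and each queues an RMW.  Only one can win; whether the losers
can withdraw their queued requests is what question (h) is getting at.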


--
Andy Glew, aglew@uiuc.edu