Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!aglew
From: aglew@dwarfs.csg.uiuc.edu (Andy Glew)
Newsgroups: comp.arch
Subject: Re: Synchronization primitives and cache coherence
Message-ID:
Date: 14 Jun 90 01:11:45 GMT
References: <1360003@aspen.IAG.HP.COM>
Sender: usenet@ux1.cso.uiuc.edu (News)
Organization: University of Illinois, Computer Systems Group
Lines: 63
In-Reply-To: aglew@dual.csg.uiuc.edu's message of 13 Jun 90 18:38:53

By the way, I am maintaining the paper "A Survey of Synchronization
Primitives" from which the previously posted comparative tables are
extracted. "Maintaining" means that I am trying to keep it reasonably
up to date and complete; ideally it would get distributed once every
year or two with the latest updates.

If you know the details of the synchronization primitives of systems
that are not in the tables or the paper, or if I have made any errors,
I'd appreciate learning about them.

[Ideally] Send me a technical reference manual for the system cpu
and/or cache and/or bus and/or memory? I need all or most of these
because the aspects that I'm interested in involve interactions
between all of the components. (Eg. many 68000 based *systems* do not
implement CAS.) But any single manual helps. (Well, you can't blame me
for trying.)

[Less ideally] Tell me where to send for such a reference. (If it
costs money I'm unlikely to be able to afford it, but the address or
phone may be useful. I'm getting good at begging for free info. If you
had a copy to lend I'd send it back to you when I'm done.)

[Least ideally] Give me the scoop yourself? I usually don't quote
information received by hearsay or email, because I've found it to be
less than reliable - even when the engineer who designed the part is
talking!
But your description might (1) sensitize me to things I should be
looking for if I get the real stuff, and (2) tell me about things that
aren't in the documentation.

I'm particularly interested in:

    a) what the atomic instructions actually are
    b) what bus transactions are actually produced
    c) are the bus transactions split?
    d) how is atomicity maintained? is the bus exclusively locked
       throughout? or is there a lock maintained at the memory
       controller?
    e) does your atomic operation use the processor cache, or just
       bypass it? does it invalidate other caches? does it invalidate
       its own cache?
    f) are there conditions where the atomic operation can short
       circuit without going through a full RMW? after reading the
       cache? after going to the bus?
    g) how are bus transactions scheduled?
    h) once a processor requests the bus for the atomic operation,
       can it abandon its request?

(The last point is important. If you can abandon a pending bus request
the time for a lock transfer in a test-and-test-and-set spinloop goes
to O(1) from O(n), or even O(n^2) if the bus scheduling is, eg., fixed
priority, where n is the number of processors).

--
Andy Glew, aglew@uiuc.edu
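
For readers who have not met the test-and-test-and-set spinloop named in the closing note, here is a minimal sketch in C11. It is an illustration under stated assumptions, not code from the survey: the names ttas_lock_t, ttas_init, ttas_try_acquire, ttas_acquire and ttas_release are hypothetical, and the C11 atomics stand in for whatever atomic instruction a given machine provides. Which bus transactions the exchange actually produces is system-specific, which is what questions (a) through (h) ask about.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type; the name and layout are illustrative only. */
typedef struct {
    atomic_bool held;
} ttas_lock_t;

static void ttas_init(ttas_lock_t *lock) {
    atomic_init(&lock->held, false);
}

/* One acquisition attempt; returns true if the caller now owns the lock. */
static bool ttas_try_acquire(ttas_lock_t *lock) {
    /* "Test": an ordinary read. On a cache-coherent machine this is
     * normally satisfied from the local cache while the lock is held. */
    if (atomic_load_explicit(&lock->held, memory_order_relaxed))
        return false;

    /* "Test-and-set": the atomic read-modify-write. It is issued only
     * after the lock has been seen free, so several waiters may issue
     * it at about the same time; exactly one sees the old value false. */
    return !atomic_exchange_explicit(&lock->held, true, memory_order_acquire);
}

/* Spin until the lock is acquired. */
static void ttas_acquire(ttas_lock_t *lock) {
    while (!ttas_try_acquire(lock)) {
        /* keep spinning on the cheap read */
    }
}

static void ttas_release(ttas_lock_t *lock) {
    /* Sanity check: releasing a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&lock->held, memory_order_relaxed));
    atomic_store_explicit(&lock->held, false, memory_order_release);
}
```

When the holder releases, every spinning waiter can see the lock free and fall through to the exchange. One of them wins; the others have already asked for the bus. Whether those pending requests can be abandoned is the subject of question (h).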