Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!pyramid!prls!mips!hansen
From: hansen@mips.UUCP (Craig Hansen)
Newsgroups: comp.arch
Subject: Re: H/W Write Buffers, S/W Synchronization (long reply)
Message-ID: <849@mips.UUCP>
Date: Wed, 28-Oct-87 16:58:08 EST
Article-I.D.: mips.849
Posted: Wed Oct 28 16:58:08 1987
Date-Received: Sat, 31-Oct-87 14:35:20 EST
References: <2280002@hpsal2.HP.COM> <845@winchester.UUCP>
Lines: 72

In article <2280002@hpsal2.HP.COM> viggy@hpsal2.HP.COM (Viggy Mokkarala) writes:
>I have been looking into a problem involving software synchronization
>using shared variables in tightly coupled private cache multiprocessor
>systems with hardware write buffers.

>The question is not "how to synchronize with write buffers", but rather the
>following:
> 1. How much code already uses this?
> 2. Is it difficult to write software with such a restriction?, and
> 3. Would it be appropriate to force software writers to identify shared
>    variables?

The problem you've been looking at is well-known in the multiprocessor
environment. The question really is how tightly-coupled (synchronized)
two processes on separate processors really are. The mechanism used by
the code example works only if writes are highly synchronized, for
example, if you ensure that before a write completes, all writes that
appeared on the bus prior to that write are reflected in the state of
the processor's cache. Such a synchronization requirement may be
violated not only because of write buffers (which permit the processor
to continue execution before the common bus sees the write), but also
because of cache-invalidate or cache-update buffering (which permits
the bus to continue operation before the cache is invalidated or
updated due to the bus write).
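The code example under discussion is not reproduced in this article.
As a hypothetical illustration of the kind of pattern that depends on
highly synchronized writes, here is a minimal sketch in C11. The names
payload, ready, unsynchronized_write and unsynchronized_read are
invented for this sketch, and relaxed atomics stand in for ordinary
loads and stores on a machine with buffered writes and invalidates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared locations; the names are illustrative only. */
static _Atomic int payload = 0;
static _Atomic bool ready = false;

/* Writer: two stores with no ordering requested between them.
 * Nothing here constrains the order in which another processor
 * observes them; write buffering or buffered invalidates may make
 * the flag visible before the payload. */
void unsynchronized_write(int value)
{
    atomic_store_explicit(&payload, value, memory_order_relaxed);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: returns false if the flag is not yet seen as set.
 * Otherwise copies the payload to *out and returns true.  When run
 * on a different processor from the writer, seeing the flag set does
 * NOT guarantee that the payload read here is the new value. */
bool unsynchronized_read(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    *out = atomic_load_explicit(&payload, memory_order_relaxed);
    return true;
}
```

Called one after the other on a single processor, the pair behaves as
expected, because program order is preserved locally. The hazard
appears only when writer and reader run on separate processors.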
Many multiprocessor systems are being built (I am aware of several
being built with MIPS CPUs) that do not have this property, and
software which runs on such machines must perform a synchronization
operation of some kind before reading a shared variable that may have
been written by another CPU. This doesn't mean that all reads of
shared variables need to be specially synchronized, nor does it mean
that all writes of shared variables need to be specially synchronized;
it means that between a write of a shared variable by one processor
and a read of the same shared variable by another, at least one
synchronization must occur between those two processors, or between
each of those two processors and a common bus.

As to whether it's difficult to write software with such a
restriction, that's somewhat of an open issue. Current research
efforts are investigating ways of writing parallelized code without
explicit synchronization operations, and some of these methods assume
rather strong requirements on cache coherence. These strong
requirements end up requiring implicit synchronization operations in
basic operations, and so are undesirable (because the extra
synchronization operations impede buffering and parallelism itself).

Identifying shared variables is either very easy or very difficult
depending on the application and environment. For example, when a
monolithic kernel (such as UNIX) is run on a shared-memory MP,
essentially all variables end up shared (when any processor can run
any portion of the kernel). Process migration may also permit any
variable (even in a single-process application) to be shared between
processors. In these cases, normally, an explicit synchronization
operation occurs between the write by one processor and the read by
another, so while it's hard to get away with explicitly indicating
shared variables, the explicit synchronization operation occurs.
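To make "an explicit synchronization operation between the write and
the read" concrete, here is a minimal sketch in C11, under the
assumption that a release fence on the writing side and an acquire
fence on the reading side are an acceptable stand-in for whatever
synchronization primitive a given machine provides. The names
shared_value, published, publish and try_consume are invented for the
sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names.  shared_value is an ordinary variable that
 * carries no special marking; only the handoff is synchronized. */
static int shared_value = 0;
static atomic_bool published = false;

/* Writer: plain store to the data, one explicit synchronization,
 * then the flag.  The release fence orders the data store before
 * the flag store as seen by a reader that uses the acquire fence. */
void publish(int value)
{
    shared_value = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&published, true, memory_order_relaxed);
}

/* Reader: returns false if nothing has been published yet.
 * Otherwise performs one explicit synchronization after seeing the
 * flag, then reads the data with a plain load. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&published, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = shared_value;
    return true;
}
```

Note that shared_value itself is read and written with ordinary
accesses. Only one synchronization per side separates the write from
the read, which is all the requirement above asks for.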
What would be useful for this sort of hardware/software environment is
an indication, not of whether a variable is "shared," because all
variables essentially are shared or sharable, but of whether such a
variable requires implicit synchronization associated with reads and
writes of the variable. I would presume that when such a variable is
so indicated to the HLL compiler system, such "implicit"
synchronization can actually be performed explicitly by software, so
that the basic hardware mechanism can always use the weaker coherence
forms that can be efficiently buffered.

-- 
Craig Hansen
Manager, Architecture Development
MIPS Computer Systems, Inc.
...decwrl!mips!hansen