Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!purdue!bu-cs!encore!cloud9!tomc
From: tomc@cloud9.Stratus.COM (Tom Clark)
Newsgroups: comp.arch
Subject: Multi-Processor Serializability
Keywords: data ordering, coherence, shared memory multiprocessing
Message-ID: <3492@cloud9.Stratus.COM>
Date: 31 Jan 89 20:49:56 GMT
Organization: Stratus Computer, Inc., Marlboro, MA
Lines: 49

In any computer system, the programmer expects operations in the source code to be carried out in the order specified. This matters most in a multi-processor system, where multiple processes may share data. For example, process A may write x' and y' to two locations, in that order. Process B, seeing y', assumes x' has been written and reads it. This may be extended causally: process B then writes z' to a third location, and process C, seeing z', assumes that x' must have been written. If for some reason these writes are performed out of order, the model on which the programmers have based their thinking is wrong.

Programmers have been used to such a model for quite a while. By the use of both explicit and implicit locks, they have exploited it to arrive at well-behaved, high-performance MP systems. However, today's compilers optimize and rearrange the order of the operations specified in the program (especially for RISC). In addition, newer high-performance processors will reorder operations within the chip, using data-driven techniques, to improve performance. Newer computers also have more complex buses (multiple paths) between the CPUs and memories, and cache coherence adds further complexity.

In the case of explicit locks, the programmer can give hints to the compiler and the hardware in order to force a certain sequence of operations. These hints may take the form of a 20-pound sledge at times (e.g. forcing serial operation of the CPU or disabling compiler optimization in some parts of the code), but they will work. The problem comes with implicit locks.
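
To make the ordering dependence in the process A / process B example concrete, here is a small Python model. It is only a sketch under stated assumptions: memory is treated as a single shared store updated one operation at a time, reordering by a compiler or CPU is modeled simply as permuting each process's own operations, and the names (A_PROGRAM, B_PROGRAM, interleavings, run, outcomes) are hypothetical and chosen for illustration. It does not model any real machine.

```python
from itertools import permutations

# Process A writes x' then y'; process B reads y, then reads x.
# The values 0 (old) and 1 (new, i.e. x' or y') are illustrative.
A_PROGRAM = (("write", "x"), ("write", "y"))
B_PROGRAM = (("read", "y"), ("read", "x"))


def interleavings(a_ops, b_ops):
    """Yield every merge of a_ops and b_ops that keeps each sequence's own order."""
    if not a_ops:
        yield tuple(("B", op) for op in b_ops)
        return
    if not b_ops:
        yield tuple(("A", op) for op in a_ops)
        return
    for rest in interleavings(a_ops[1:], b_ops):
        yield (("A", a_ops[0]),) + rest
    for rest in interleavings(a_ops, b_ops[1:]):
        yield (("B", b_ops[0]),) + rest


def run(schedule):
    """Execute one schedule against memory; return what B observed as (y, x)."""
    memory = {"x": 0, "y": 0}
    seen = {}
    for _proc, (kind, loc) in schedule:
        if kind == "write":
            memory[loc] = 1
        else:
            seen[loc] = memory[loc]
    return (seen["y"], seen["x"])


def outcomes(allow_reordering):
    """Collect every (y, x) pair B can observe.

    With allow_reordering=False each process issues its operations in
    program order. With True, any order of each process's operations is
    allowed, standing in for a compiler or CPU that reorders them.
    """
    a_orders = list(permutations(A_PROGRAM)) if allow_reordering else [A_PROGRAM]
    b_orders = list(permutations(B_PROGRAM)) if allow_reordering else [B_PROGRAM]
    results = set()
    for a_ops in a_orders:
        for b_ops in b_orders:
            for schedule in interleavings(tuple(a_ops), tuple(b_ops)):
                results.add(run(schedule))
    return results


in_order = outcomes(allow_reordering=False)
reordered = outcomes(allow_reordering=True)
BROKEN = (1, 0)  # B saw y' but then read the old x

print("program order:", sorted(in_order))
print("with reordering:", sorted(reordered))
```

In this model the outcome (1, 0), where B sees y' but still reads the old x, never occurs while both processes run in program order, and becomes possible as soon as either A's writes or B's reads may be reordered. Note that nothing in the two programs marks the dependence; it exists only in the order of the references.
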
An implicit lock is a dependence on the ordering of data references (both reads and writes). These are often very hard to find by inspection, even if one has the time to examine all parts of the source code.

Have people thought about how to get older software to work on newer machines and compilers? Obviously, older applications and operating systems would like to take advantage of the new technology, but finding all these problems before putting the system into production is very difficult. The class of bugs introduced is such that testing cannot be depended upon to find them, and they will likely occur when the system is very busy (when the cost of failure is high).

I'd like to hear any suggestions for dealing with this problem. Even if you can handle the issue for your own code, how do you train a customer to do it for their code? What techniques (hardware and software) can be applied? Is the problem solvable?

- Tom Clark
- Stratus Computer, Inc., Marlboro, MA
- Disclaimer: My opinions, nobody else's.