Path: utzoo!utgpu!bnr-vpa!bnr-fos!bnr-public!schow
From: schow@bnr-public.uucp (Stanley Chow)
Newsgroups: comp.arch
Subject: Re: Multi-Processor Serializability
Summary: Read the problem statement, it is not trivial:
Keywords: data ordering, coherence, shared memory multiprocessing
Message-ID: <272@bnr-fos.UUCP>
Date: 2 Feb 89 22:34:11 GMT
References: <3492@cloud9.Stratus.COM> <1066@cmx.npac.syr.edu>
Sender: news@bnr-fos.UUCP
Reply-To: schow@bnr-public.UUCP (Stanley Chow)
Organization: Bell-Northern Research, Ottawa, Canada
Lines: 38


The problem that Tom posted concerns compilers doing optimizations
that are correct for a single process but become wrong under
multi-processing.
 
Specifically, if x and y are declared in shared memory (but not volatile),
then it is legal for the compiler to reorder writes to them. All
single-process semantics can be preserved (if the compiler did it right).
The problem comes from a second process looking at y and making
inferences about the value of x. That is when reordered writes become
a problem.
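
To make the reordering above concrete, here is a minimal single-threaded
sketch in C. It only simulates what a second process could observe
between the writer's two stores; it is not a real multi-process test.
The struct, the function names and the value 42 are all made up for
illustration and come from nobody's actual code.

```c
#include <assert.h>

/* Hypothetical shared variables. In a real system these would live in
 * a shared-memory segment. Deliberately NOT volatile. */
struct shared { int x; int y; };

enum order { AS_WRITTEN, REORDERED };

/* Do only the first of the writer's two stores. The source says
 * "x = 42; y = 1;" but the compiler is allowed to emit y first. */
void writer_first_store(struct shared *s, enum order o)
{
    if (o == AS_WRITTEN)
        s->x = 42;
    else
        s->y = 1;
}

/* Do the remaining store. After this both orders give the same state,
 * which is why a single process can never tell the difference. */
void writer_second_store(struct shared *s, enum order o)
{
    if (o == AS_WRITTEN)
        s->y = 1;
    else
        s->x = 42;
}

/* The second process: if it sees y set, it infers x is already valid.
 * Returns 1 if that inference holds (or was not made), 0 if it is wrong. */
int reader_inference_ok(const struct shared *s)
{
    if (s->y == 1)
        return s->x == 42;
    return 1;
}

/* Walk both orders, looking at the state in between the two stores. */
void check_scenarios(void)
{
    struct shared a = {0, 0}, b = {0, 0};

    writer_first_store(&a, AS_WRITTEN);
    assert(reader_inference_ok(&a));      /* y not set yet, no inference */

    writer_first_store(&b, REORDERED);
    assert(!reader_inference_ok(&b));     /* y set, x still stale */

    writer_second_store(&a, AS_WRITTEN);
    writer_second_store(&b, REORDERED);
    assert(a.x == b.x && a.y == b.y);     /* final states are identical */
}
```

The point of the sketch is that the final state is the same either way;
only an observer looking in the middle can be misled.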
 
Note that using 'volatile' is one possible solution but is incomplete
in two ways:
  - if all my data is shared and I put volatile on everything, it
    can be a major major performance hit.
  - the ANSI definition of volatile is intended for dealing with
    I/O registers and such (at least from what little I know of it).
    There can be circumstances where reordering between references to
    shared memory and references to my own process memory causes
    errors. (I am not saying this is good or desirable, just that
    there are systems with this attribute.)
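
A small sketch of the second point, assuming a made-up flag in shared
memory that has been declared volatile and a made-up variable in the
process's own memory that has not. As far as I know, volatile only
orders volatile accesses against one another, so the compiler remains
free to move the ordinary store past the volatile one. The names below
are hypothetical.

```c
#include <assert.h>

volatile int shared_flag;  /* hypothetical: lives in shared memory */
int local_state;           /* hypothetical: my own process memory  */

/* The source order says "update my state, then raise the flag".
 * Only shared_flag is volatile, so the store to local_state is an
 * ordinary store that the compiler may legally emit AFTER the flag
 * is raised. Anything that sees the flag and then looks at
 * local_state (a signal handler, say) is relying on an ordering
 * that volatile alone does not promise. */
void publish_state(int v)
{
    local_state = v;
    shared_flag = 1;
}

/* Run single-threaded, the result is the same whichever order the
 * compiler picked; the hazard only exists for an outside observer. */
void check_publish(void)
{
    publish_state(7);
    assert(local_state == 7);
    assert(shared_flag == 1);
}
```

Making local_state volatile as well would restore the ordering between
the two stores, which leads straight back to the first point: volatile
on everything.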

The more aggressive optimizations mentioned, where program semantics can
be violated momentarily (due to wrong branch prediction and/or vector
performance, but fixed later), can be very nasty if an interrupt comes
at the wrong time and the interrupt handler (or scheduler, or trap
handler) was written assuming a certain ordering of the memory references.

Preemptive flame retardant: I am not defending this style of coding.
But the reality is that there exist large software bases that were
written relying on this ordering (perhaps because the software was
written before anybody knew how to do multi-processing).
 

Stanley Chow  ..!utgpu!bnr-vpa!schow%bnr-public
              (613) 763-2831
 
RISCy: We already know the ultimate RISC (the single instruction CPU)
       is a low performer. Why does anyone want to be RISCy?