Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!gatech!hubcap!jerbil
From: jerbil@csvax.caltech.edu (Stainless Steel Gerbil [Joe Beckenbach])
Newsgroups: comp.parallel
Subject: Re: A possible Linda application, Re: parallel make
Keywords: parallel simulation
Message-ID: <5600@hubcap.clemson.edu>
Date: 25 May 89 12:10:36 GMT
Sender: fpst@hubcap.clemson.edu
Lines: 126
Approved: parallel@hubcap.clemson.edu


	Larry Mellon writes a quick overview of TimeWarp-type optimistic
simulators.  This is not to jump on him at all about this  (it's not 
his fault :-) but to voice the opinion of a current 'detractor'.

>    To people of the 'conservative mechanism' ilk: I concede there are
>applications better suited for conservative synchronization than TimeWarp, 
>so please, no hate mail.

	No problem by me, Larry.  If there's any hate mail about this
posting, send it to me.  Better yet, answer my objections [or what you
perceive as my objections].


In his article <5573@hubcap.clemson.edu> Larry Mellon writes:
>    Jefferson's TimeWarp is intended to solve the problems of synchronizing
>simulation time across multiprocessors in the operating system, rather
>than in the user's code.  It is best suited to descrete-event simulations;
>generally, the more compute time per event, the better your speedup.

	It seems to me that a discrete-event simulator need not have
the user concern himself with simulation time at all, other than to
print out its value in reports partway through the simulation process.
The speed-up comes from assuming that one processor is given only one 
simulator node to work with, is the impression that I get from reading
some papers and talking with the people here, including one who gave one
of the few pro-conservative-simulation results in the Miami conference
on simulation (1989 Distributed Simulation Conference).


>    Basically, a simulation consists of entities which pass events
>between themselves, and take some action when an event is received.
>Each event has a time-stamp, which is when that event is to occur 
>in simulation time.  An entity acts on each event as soon as it is
>received, *with no attempt to verify that the event is in correct
>simulation time order*.  This is called 'optimistic synchronization'
>by supporters, and 'silly' by opponents.  Briefly, the rational is that if
>you verify the event order, you are wasting potential execution time
>and clogging your communication links with pointless queries.  
>    "But, but, you've wasted processing time on event X!" stammer
>the distractors.  Well, not really.  If we had remained blocked until
>X-10 had arrived, that processing time was 'wasted' anyways.  If fact,
>the worst case of TimeWarp, where all events arrive out of order, is
>exactly equivalent to the 'conservative' synchronization method;
>where entities remain blocked until event order has been verified.
>    "And what is this 'rollback'?" you ask?  That's a tricky one.
>While an entity is excuting, snapshots of it's state are taken from
>time to time.  A snapshot consists of all static and automatic variables
>and events sent or received by the entity.  A rollback consists of
>restoring an entity to one of it's previous states.

	Translation: project as far into the future as you can, while
still being able to save all your state information.  If you receive 
something 'out-of-order', step back to the right spot, process the new
event, and start processing all events 'after' it but received before it.
This entails saving all the state information and the timestamped event
notifications, using a very large amount of memory if you wish to be sure
that your simulation is valid.
	One paper reputedly at the conference shows the type of simulation
which fits this set of constraints well:  a simulated pool game, with 
a processor assigned to a square area of the table.  In my honest opinion,
this `toy' shows the best way to use TimeWarp:  absolute minimal changes in
state [eg a billiard ball hits another, or a bumper, or changes processors
only occasionally compared to sitting still or rolling unimpeded].  I'm
not so sure that this is a livable constraint for most simulations.  For 
instance, gate-level VLSI simulations lose in TimeWarp.  [Not only that,
but the potential parallelism would not be realized except minimally.]

	The notion of one blocked process wasting processor time implies
strongly that TimeWarp is based on a one process per node model of computation.
That constraint keeps potential parallelism potential, and as such is not
of any interest to me.
	Again, my honest opinion, TimeWarp seems to be doing half-measures in
parallelizing a simulation:
 * If it needed to be parallel in only a few pieces, why not let the user
	write the synchronization, or supply libraries to support it?  Surely
	the interactions can be understood well enough at that level.
 * If it can parallelize to many pieces, why not go the whole hog and multi-
	process on the different physical nodes [either truly, or via light-
	weight processes]?  This hides the communnication overhead, and answers
	the objection about 'wasting processor time'.


>    I will go out on a limb and state that current research shows that
>optimistic beats out conservative in the general case.

	And I will go out on the other limb and state that current research
I have seen, including Wen-King Su's paper (presented at Miami, titled
"Variants of the Chandy-Misra-Bryant Distributed Discrete-Event Simulation
 Algorithm"), would demonstrate that conservative simulation is more
widely usable than optimistic simulation, given a range of machine hardware,
distributed environments, and simulations.
	My caveats and speculations on this particular limb:
C1)  I know of no paper directly comparing two different implementations of a
	collection of simulations, or even a single simulation, where one is a
	conservative and one an optimistic, with hard result such as run times,
	simulation sizes, data variation, usage of resources, and agreement of
	results with a standard sequential implementation of that simulation.
S1)  The more inherent parallelism a problem has, the less amiable it will be
	to optimistic simulation due to each event-causing 'object' being an
	independent source of events not in the predicted stream of a TimeWarp
	process.
S2)  My experience with simulation is on hardware running sequential and
	message-passing operating systems under the simulation user code.
	TimeWarp might indeed be much better suited for shared-memory machines,
	but then physical limitations keep the number of parallel physical
	nodes to some low number.

References to performance comparison papers, or even good papers on the
implementation of TimeWarp, are welcome!  The definitive experiment would
pit several implementations of simulations over this matrix:
	[ sequential, conservation, optimistic ] x
	[ qualitatively-varied suite of simulations ] x
	[ machines on which all implementations can run ] x
	[ measures of performance (time, result correctness, etc) ]

		Joe Beckenbach

PS Are there any good grad schools out there with professors interested in
	simulation in general, preferably with a Software Engineering option
	and no firm entrenchment of either optimistic or conservative
	simulation?  I'm considering what I want to do with my next few years. :-)
-- 
Joe Beckenbach			   |	
jerbil@csvax.caltech.edu	   |	This plot for cultivation of
Caltech 256-80, Pasadena CA 91125  |	a fertile imagination.