Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!gatech!hubcap!matloff
From: matloff@iris.ucdavis.edu (Norm Matloff)
Newsgroups: comp.parallel
Subject: Re: Distributed simulation
Message-ID: <5319@hubcap.clemson.edu>
Date: 28 Apr 89 12:11:47 GMT
Sender: fpst@hubcap.clemson.edu
Lines: 65
Approved: parallel@hubcap.clemson.edu

In article <5312@hubcap.clemson.edu> wagner@june.cs.washington.edu (Bullwinkle J. Moose) writes:

>While it's true that running parallel independent replications has a
>lot of appeal, there are some mitigating considerations.  The first and
>most obvious of these is that beyond a certain point, the marginal
>effect on confidence interval width of increasing the number of
>replications becomes insignificant.

This can not be literally true. The accuracy will be roughly inversely 
proportional to the square root of the number of replications, no matter
how many replications there are.  If you quadruple the number of
replications, you should always get double the accuracy (except for
init. bias).

Perhaps you meant that after a certain point, any further decrease in
CI width are of marginal value to the USER?

>Second, because of the presence of an initialization bias in any
>stochastic simulation, it is important to run each replication long
>enough to ensure that the confidence intervals that one obtains for
>some estimator actually contain the true value of what is being
>estimated.

Yes, as I recall, that was the theme of Heidelberger's paper.  But
it seems to me that it's not so simple.

For example, in typical applications, a number of runs will be made
of the same simulation program, with different parameters, e.g.
different arrival rates.  After doing the first run, the user has
a pretty good idea of the steady-state queue lengths, and can
initialize the other runs with nonempty queues of those approximate
sizes.  This should reduce the initialization bias quite substantially.

Or, better yet, in systems which have regeneration points and don't
have huge (in the notation from my posting yesterday, >> T/P)
regeneration cycle times, one can simply start each replicate at
the regeneration point  --  thus no initialization bias at all.

Of course, you do run into the problem you mentioned, i.e. the
problem of "waiting for the slowest one," which gave you your
P/log P figure.  However, analogous kinds of problem should plague
the task-partitioning approach too, e.g. as P increases, the
problem of idle time should get worse and worse.

The task-partitioning approach also has a scaling problem.  For
a fixed application, say a 10-server queue, there will be a very
small limit on the usable value of P.  After a certain point,
you just can't find anything for the processors to do.  On the
other hand, the replicates approach has this too, i.e. for a
fixed problem, a large value of P exacerbates either the
initialization bias problem or the "wait for the slowest one
problem."

So I don't think that either approach is a clear win.  But the
replicates approach certainly is much simpler.

>Finally, a pragmatic consideration against running multiple independent
>replications in parallel is that some simulations have enormous memory
>requirements,

In most simulations of queueing systems in practice, memory should not
be a problem.  The length of the event list should be on the order of 
magnitude of the number of servers, which is generally not large.

   Norm