Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!gatech!hubcap!wagner From: wagner@june.cs.washington.edu (Bullwinkle J. Moose) Newsgroups: comp.parallel Subject: Re: Distributed simulation Summary: When intuition fails... Message-ID: <5367@hubcap.clemson.edu> Date: 2 May 89 12:37:11 GMT Sender: fpst@hubcap.clemson.edu Lines: 125 Approved: parallel@hubcap.clemson.edu In article <5319@hubcap.clemson.edu>, matloff@iris.ucdavis.edu (Norm Matloff) writes: > In article <5312@hubcap.clemson.edu> wagner@june.cs.washington.edu (Bullwinkle J. Moose) (Hey! that's me!) writes: > >Because of the presence of an initialization bias in any > >stochastic simulation, it is important to run each replication long > >enough to ensure that the confidence intervals that one obtains for > >some estimator actually contain the true value of what is being > >estimated. > > Yes, as I recall, that was the theme of Heidelberger's paper. But > it seems to me that it's not so simple. > > For example, in typical applications, a number of runs will be made > of the same simulation program, with different parameters, e.g. > different arrival rates. After doing the first run, the user has > a pretty good idea of the steady-state queue lengths, and can > initialize the other runs with nonempty queues of those approximate > sizes. This should reduce the initialization bias quite substantially. Although your argument seems quite reasonable, it is, alas, untrue. First of all, note that the steady-state of a stochastic process is represented not by a single value, but by a *distribution* of that value. Therefore, what you really need to do is estimate the distribution of the steady-state, and then start up a number of runs, with the intitial conditions chosen at random from that distribution. (Note that if you knew the distribution of steady-state conditions exactly, then there would be no point to doing the simulation at all!) As an example of what happens if steady-state is chosen to be the average value instead of a set of samples from the (estimated) distribution, I refer you to "Statistical Analysis of Simulation Output Data", by A.M. Law, Oper. Res. 31(6):983-1029 (June 1983). In the following excerpt from pages 1015-1017, my own comments are enclosed in [...]: "The suggested procedure [using pilot runs to estimate steady-state] would be easy to apply in the case of an M/M/1 queue, since the state of the system is a single integer-valued r.v., namely, the number of people in the system. In the case of a complex real-world simulation, however, this procedure may require estimating the joint distribution of a large number of r.v.'s, some of which may be continuous. As a simple example of this difficulty, consider a single-server queueing system without exponential interarrival times or service times. In this case, we would have to estimate the joint distribution of number in system, time to the next arrival, and the remaining service time of the customer in service (if any). [ In the following paragraph, Di is defined to be the delay experienced by the ith customer to arrive at the server, and d is defined to be the steady-state expected average delay, i.e. lim(n-->oo) (D1+D2+...+Dn)/n] "Kelton and Law ["The Transient Behavior of the M/M/s Queue with Implications for Steady-State Simulation", TR 82-7, Dept. of MIS, U. of Arizona, 1982] computed E(Di) as a function of i and the number in system at time 0, say, l [lower-case l, not 1] for various M/M/s queues.... Let i(l,0.01) [be] the smallest point [value of i] beyond which all values of E(Di) fall withing 1 percent of d, viewed as a function of the initial number in the system, l. Kelton and Law found for the M/M/1 queue with rho=0.9 that i(l,0.01) [^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^] is minimized for l = 15. This result is interesting since [^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^] the steady-state time-average number of customers in system is 9. [^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^] In general, Law's article is an excellent guide to the pitfalls (his word) of using classical statistical reasoning in the analysis of simulation output data. Not to mention intuition! > Or, better yet, in systems which have regeneration points and don't > have huge (in the notation from my posting yesterday, >> T/P) > regeneration cycle times, one can simply start each replicate at > the regeneration point -- thus no initialization bias at all. True, the method of regeneration theoretically does not suffer from the initialization bias problem. But "theoretically" is the key word here, because unless what you are simulating is truly a regenerative process, your regeneration points are only approximate. And using regeneration would definitely introduce the problem of stragglers (see my last posting), because it might lead to a wide variation in completion time of the replications unless the number of regeneration cycles within each replication is very large. > Of course, you do run into the problem you mentioned, i.e. the > problem of "waiting for the slowest one," which gave you your > P/log P figure. However, analogous kinds of problem should plague > the task-partitioning approach too, e.g. as P increases, the > problem of idle time should get worse and worse. But, since the number of replications being run when intra-replication parallelism is used in less than P, the magnitude of the effect is smaller. I didn't claim that the "tail effect" problem was a major one, only that it "nudges the scale" a bit in the direction of intra-replication parallelism. > The task-partitioning approach also has a scaling problem. For > a fixed application, say a 10-server queue, there will be a very > small limit on the usable value of P. After a certain point, > you just can't find anything for the processors to do. Definitely true, which is why the best approach for a large number of processors is probably a combination of inter- and intra- replication parallelism. > > So I don't think that either approach is a clear win. But the > replicates approach certainly is much simpler. > > >Finally, a pragmatic consideration against running multiple independent > >replications in parallel is that some simulations have enormous memory > >requirements, > > In most simulations of queueing systems in practice, memory should not > be a problem. The length of the event list should be on the order of > magnitude of the number of servers, which is generally not large. Depends on what is being simulated. I have friends at Bellcore who were simulating a metropolitan trunk network in which the steady-state size of the event list was about 15 THOUSAND events. "All generalizations are false, including this one." Dave Wagner University of Washington Comp Sci Department wagner@cs.washington.edu {cornell,harvard,ssc-vax,tektronix}!uw-beaver!wagner