Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!think.com!mintaka!bloom-beacon!eru!hagbard!sunic!mcsun!ukc!icdoc!qmw-cs!timk
From: timk@cs.qmw.ac.uk (Tim Kindberg)
Newsgroups: comp.os.misc
Subject: Re: process migration - status and availability
Message-ID: <3057@redstar.cs.qmw.ac.uk>
Date: 17 Apr 91 16:54:09 GMT
References: <10422@pitt.UUCP> <16932@chopin.udel.edu>
Sender: usenet@cs.qmw.ac.uk
Lines: 63
Nntp-Posting-Host: aux34

In <16932@chopin.udel.edu> gdtltr@chopin.udel.edu (root@research.bdi.com 
(Systems Research Supervisor)) writes:

>In article <10422@pitt.UUCP> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
>=>
>=>Why isn't process migration common?
>=>
>   Two reasons: 1) It is hard to do in general. There is more to a
>process's state than most people realize, including the kernel state and
>relationships with other processes. There is also the potential overhead
>of copying the entire code and data across a network.

To minimise copying, standard VM techniques can be employed.  Sprite, for 
example, writes dirty data/stack pages to disc and page-faults its address 
space components back in again as necessary at the new site.  Incidentally, 
Sprite exhibits another migration issue related to your first point.  A 
migrated process in Sprite still depends for some facilities on its original 
site, making it vulnerable to that site's failure; and also meaning that it 
continues to impose a certain amount of load there.
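
The flush-then-fault idea can be shown with a toy model.  This is only a 
sketch, not Sprite's code: the names BackingStore, AddressSpace and migrate 
are invented for illustration, and a dictionary stands in for the disc that 
both sites can reach.

```python
# Toy model only; not Sprite's code. All names here are hypothetical.

class BackingStore:
    """Stands in for disc storage reachable from both sites."""
    def __init__(self):
        self.pages = {}              # (pid, page_no) -> bytes


class AddressSpace:
    def __init__(self, pid, store):
        self.pid = pid
        self.store = store
        self.resident = {}           # page_no -> bytearray held in memory
        self.dirty = set()           # pages modified since the last flush
        self.faults = 0

    def write(self, page_no, data):
        self.resident[page_no] = bytearray(data)
        self.dirty.add(page_no)

    def read(self, page_no):
        # A page that is not resident is "faulted in" from the backing store.
        if page_no not in self.resident:
            self.faults += 1
            self.resident[page_no] = bytearray(self.store.pages[(self.pid, page_no)])
        return bytes(self.resident[page_no])

    def flush_dirty(self):
        """Write every dirty page to the backing store; return how many."""
        count = len(self.dirty)
        for page_no in sorted(self.dirty):
            self.store.pages[(self.pid, page_no)] = bytes(self.resident[page_no])
        self.dirty.clear()
        return count


def migrate(space):
    """Flush dirty pages, then start at the new site with nothing resident."""
    flushed = space.flush_dirty()
    return AddressSpace(space.pid, space.store), flushed


store = BackingStore()
src = AddressSpace(7, store)
src.write(0, b"stack")
src.write(1, b"heap")
src.write(2, b"cold data")
dst, flushed = migrate(src)
value = dst.read(1)                  # first touch at the new site faults the page in
```

Page 2 is never touched after the move, so in this model it never reaches the 
new site's memory; only the pages the process actually uses are brought in.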

Migration is a better prospect on a distributed memory multiprocessor with a 
high interconnection bandwidth (my own kernel, Equus, shows this: 60 
milliseconds to migrate all of a 100K process over a VME bus-based network; 
420 milliseconds for a 1M process).
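
As a rough reading of those two figures, one can fit a straight line, cost = 
fixed overhead + per-KB cost.  The linear model is an assumption, as is 
taking 100K to mean 100 KB and 1M to mean 1024 KB; with only two data points 
this is an estimate and not a measurement.

```python
def fit_linear(size1_kb, t1_ms, size2_kb, t2_ms):
    """Fit t = fixed + per_kb * size through two (size, time) points."""
    per_kb = (t2_ms - t1_ms) / (size2_kb - size1_kb)
    fixed = t1_ms - per_kb * size1_kb
    return fixed, per_kb


# Assumed sizes: 100K = 100 KB, 1M = 1024 KB.
fixed_ms, per_kb_ms = fit_linear(100, 60, 1024, 420)
kb_per_sec = 1000 / per_kb_ms        # effective copy rate implied by the slope

print(f"fixed overhead ~ {fixed_ms:.0f} ms")
print(f"per-KB cost    ~ {per_kb_ms:.2f} ms")
print(f"effective rate ~ {kb_per_sec:.0f} KB/s")
```

Under these assumptions the fixed cost comes out at about 21 ms and the copy 
rate at roughly 2.5 Mbytes per second.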
 
>There are a couple
>papers on the subject that confirm this, but I don't have my bibliographical
>stuff with me.
See Y. Artsy, R. Finkel, 'Designing a process migration facility - the 
Charlotte experience', IEEE Computer, vol 22, no 9, Sep 89, pp 47-56, for info 
and references on a number of designs.

>2) Depending on the system model, you may get comparable performance simply
>by using load balancing when new processes are created and leaving them
>where they go. The only case where I would want process migration is when
>more than one long-running CPU-bound processes are assigned (erroneously)
Erroneously? Who/what knew they were going to be CPU bound?  Or, suppose the 
system put them there because it made sense given the load on the other 
machines at the time?  Having a process migration facility means that, when 
the cross-machine load profile becomes unbalanced due to processes dying or 
entering new phases with different load-related behaviours, you can do 
something about it.
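
A small simulation makes the point.  It is a sketch under invented 
assumptions: load is a single number per process, placement picks the least 
loaded machine, and the rebalance rule (move the heaviest process that still 
narrows the gap) is made up for illustration and not taken from any 
particular system.

```python
def total(procs):
    """Sum of the loads of the processes on one machine."""
    return sum(procs.values())


def place(machines, name, load):
    """Creation-time balancing: put the new process on the least loaded machine."""
    target = min(machines, key=lambda m: total(machines[m]))
    machines[target][name] = load
    return target


def rebalance(machines, threshold=1.0):
    """Migrate from the busiest to the idlest machine while that narrows the gap."""
    moves = []
    while True:
        busiest = max(machines, key=lambda m: total(machines[m]))
        idlest = min(machines, key=lambda m: total(machines[m]))
        gap = total(machines[busiest]) - total(machines[idlest])
        # Only a process lighter than the gap leaves the pair more even.
        candidates = [p for p, load in machines[busiest].items() if load < gap]
        if gap <= threshold or not candidates:
            return moves
        proc = max(candidates, key=lambda p: machines[busiest][p])
        machines[idlest][proc] = machines[busiest].pop(proc)
        moves.append((proc, busiest, idlest))


machines = {"a": {}, "b": {}}
for name in ("p1", "p2", "p3", "p4"):
    place(machines, name, 1.0)       # placement is perfectly balanced: 2.0 each

del machines["b"]["p2"]              # both processes on b finish,
del machines["b"]["p4"]              # leaving a with 2.0 and b with 0.0

moves = rebalance(machines)          # only migration can fix this now
```

Placement did nothing wrong here; the imbalance appears later, and without 
migration machine b simply sits idle.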

Other reasons for having process migration (apart from withdrawing when 
someone logs on to a previously idle workstation): 1) if two processes start a 
lengthy interaction involving only synchronous communication, migrate one of 
them to the other's site to save network overhead.  2) I don't have virtual 
memory in Equus; if a process attempts to increase its data size and fails for 
lack of memory, it can migrate to another site where there is sufficient 
memory.
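
The second case can be sketched as follows.  This is not the Equus code: 
Site, Process and grow are hypothetical names, and taking the first site with 
enough free memory is an assumed policy chosen to keep the example short.

```python
class Site:
    def __init__(self, name, free_kb):
        self.name = name
        self.free_kb = free_kb


class Process:
    def __init__(self, site, size_kb):
        self.site = site
        self.size_kb = size_kb
        site.free_kb -= size_kb      # the process occupies memory at its site


def grow(proc, extra_kb, sites):
    """Grow in place if possible; otherwise migrate to a site that can hold
    the enlarged process. Returns False if no site has room."""
    if proc.site.free_kb >= extra_kb:
        proc.site.free_kb -= extra_kb
        proc.size_kb += extra_kb
        return True
    needed = proc.size_kb + extra_kb
    for site in sites:               # assumed policy: first fit
        if site is not proc.site and site.free_kb >= needed:
            proc.site.free_kb += proc.size_kb    # release memory at the old site
            site.free_kb -= needed
            proc.site = site
            proc.size_kb = needed
            return True
    return False


a = Site("a", 300)
b = Site("b", 1000)
p = Process(a, 200)                  # a now has 100 KB free
in_place = grow(p, 50, [a, b])       # fits at a, no migration
moved = grow(p, 200, [a, b])         # a has only 50 KB free, so p moves to b
refused = grow(p, 10000, [a, b])     # no site is large enough
```

Note that the destination must hold the whole enlarged process, not just the 
increment, since without virtual memory all of it has to be resident.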

I'm interested to hear about other implementations.  In particular, I'm not 
familiar with the AIX one: can anyone give me a reference for that?


--

Tim Kindberg

UUCP:      timk@qmw-cs.uucp                      | Computer Science Dept
ARPA:      timk%cs.qmw.ac.uk@nsfnet-relay.ac.uk  | QMW, University of London
JANET:     timk@uk.ac.qmw.cs                     | Mile End Road
Voice:     +44 71 975 5236 (Direct Dial)         | London E1 4NS