Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sdd.hp.com!spool.mu.edu!munnari.oz.au!yoyo.aarnet.edu.au!sirius.ucs.adelaide.edu.au!hydra!francis
From: francis@cs.ua.oz.au (Francis Vaughan)
Newsgroups: comp.os.mach
Subject: Re: Efficient copying from task to task
Message-ID: <3335@sirius.ucs.adelaide.edu.au>
Date: 21 May 91 10:53:33 GMT
Article-I.D.: sirius.3335
References: <1991May17.074526.19128@sics.se> <3319@sirius.ucs.adelaide.edu.au> <1991May20.185351.17114@ecrc.de>
Sender: news@ucs.adelaide.edu.au
Reply-To: francis@cs.adelaide.edu.au
Distribution: comp
Organization: Adelaide University, Computer Science
Lines: 120
Nntp-Posting-Host: hydra.ua.oz.au


In article <1991May20.185351.17114@ecrc.de>, veron@ecrc.de (Andre Veron) writes:

|> We then put some hope in the lazy copying of MACH.  The problem then
|> appeared to be that MACH as well as UNIX is not designed for applications
|> which need intensive forking of "threads" of computation which have their
|> own separate address spaces. The task in MACH is a coarse grain entity
|> which has a whole bunch of available facilities like files, communication
|> ports which are not always needed but are always there. A task is
|> consequently costly to create, to schedule and to terminate.

It is probably unreasonable to saddle Mach with the blame for the
cost of fork etc. It is a bit sad that the call you really need
(task_create()) is not implemented in 2.5 and derived systems.
However this is more a reflection of Mach as a development/research
system than anything else.

Many of the overheads you accuse tasks of are not necessary for the
task, but rather part of the stuff added to make a task look like a
Unix process.

........

|> 
|> What we ended up with is a proposal for a new kind of thread/process
|> which we believe is implementable in an operating system and fits more
|> our needs. These threads/processes are executed in the same address space
|> except in some precise regions that they personally own and which are copied
|> (lazily) from the parent at creation time. The virtual space
|> appears then to be "locally layered":

I guess a lot of us have wished for just a little local address
protection for a thread within a task. Your suggestion has merit.
Intuitively the cost would be somewhere between raw threads in the
same address space and full tasks. See later.


|>  Within a quantum of time allocated
|> to a global "task" (not a MACH task any more) context switching between
|> these threads/processes is cheap - the cost is that of unmapping
|> the pages of one thread/process and mapping those of the next one.
|> Since the concept of private region is hardwired in the paradigm
|> all virtual memory handling can be done at forking time/context-switching
|> time when the system is in kernel mode.

|> No additional and inelegant system calls
|> are needed to set up the execution environment of a newly created/scheduled
|> thread/process.

Well, no more than there are already. Someone must define those
memory areas and the appropriate attributes. Not a lot different to
vm_inherit().
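
To make the point about per-region attributes concrete, here is a minimal
sketch of the idea as a toy model. It is not Mach code and does not call any
Mach interface. The names Inherit, Region and spawn are hypothetical, and the
two attributes only loosely mirror the share and copy inheritance that
vm_inherit() lets a task set on a range of its address space. Page contents
are modelled as immutable bytes objects, so copying the page map while sharing
the pages stands in for lazy copying.

```python
from dataclasses import dataclass, field
from enum import Enum


class Inherit(Enum):
    SHARE = "share"  # child uses the very same region as the parent
    COPY = "copy"    # child gets its own page map; pages are copied lazily


@dataclass
class Region:
    inherit: Inherit
    pages: dict = field(default_factory=dict)  # page number -> bytes (immutable)

    def write(self, page, data):
        # Replacing the map entry stands in for the copy-on-write fault:
        # other maps that still reference the old bytes object are unaffected.
        self.pages[page] = bytes(data)


def spawn(parent):
    """Return the child's region table, built from the parent's attributes."""
    child = {}
    for name, region in parent.items():
        if region.inherit is Inherit.SHARE:
            child[name] = region  # same Region object: writes are mutually visible
        else:
            # Copy the page map only; page contents stay shared until written.
            child[name] = Region(Inherit.COPY, dict(region.pages))
    return child


# Hypothetical layout: one shared region, one privately owned region.
parent = {
    "heap": Region(Inherit.SHARE, {0: b"shared"}),
    "private": Region(Inherit.COPY, {0: b"parent"}),
}
child = spawn(parent)
child["heap"].write(0, b"seen by both")
child["private"].write(0, b"child only")
```

After the two writes, the parent sees the new contents of "heap" but still
sees b"parent" in "private". Either way somebody has to say which regions are
which, whether through a call like vm_inherit() or through the creation call
itself.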

A few thoughts come to mind. 

As noted above, the cost should intuitively fall somewhere between raw
threads in the same address space and full tasks. The whole thing could
be more powerful than you suggest.

The cost of conventional context switching is not high in terms of
work directly done to bring the context switch about. Rather it is
the invalidation of caches and PTLBs that causes pain as the new
process gets started again.  Your suggestion would actually involve
a lot more code in the context switch than is currently needed, but
you would hope to gain from little or no PTLB and cache wrecking.

Luckily on multis there would be no need for PTLB shootdown (Mach's
current answer to brain dead MMUs that don't have coherent PTLBs),
as the changes to the memory map are per thread and hence
cannot be of consequence to other processors.

Your proposal would be very heavily dependent upon the underlying
architecture for efficiency. You would need an MMU capable of
selective PTLB invalidations, otherwise you would need to kill the
whole PTLB which would make the context switch just as expensive as
a full task switch. Most MMUs with PTLBs have this facility.
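
The difference between selective invalidation and killing the whole PTLB can
be illustrated with a toy model. This is only a sketch under stated
assumptions: the TLB class, the misses_after_switch function, and the page
counts (60 shared pages, 4 private pages per thread) are all hypothetical, the
buffer is treated as unlimited in size, and a "miss" just counts the
translations that would need a page table walk after the switch.

```python
class TLB:
    """Toy translation buffer: virtual page -> physical frame, no size limit."""

    def __init__(self):
        self.entries = {}
        self.misses = 0

    def translate(self, vpage, page_table):
        if vpage not in self.entries:
            self.misses += 1  # would cost a page table walk on real hardware
            self.entries[vpage] = page_table[vpage]
        return self.entries[vpage]

    def invalidate(self, vpages):
        # Selective invalidation: drop only the named entries.
        for vpage in vpages:
            self.entries.pop(vpage, None)

    def flush(self):
        # Fallback when the MMU cannot invalidate selectively.
        self.entries.clear()


def misses_after_switch(selective, shared_pages=60, private_pages=4):
    """Count the misses thread B takes after switching away from thread A."""
    shared = {v: v + 1000 for v in range(shared_pages)}
    private_range = range(shared_pages, shared_pages + private_pages)
    table_a = {**shared, **{v: v + 2000 for v in private_range}}
    table_b = {**shared, **{v: v + 3000 for v in private_range}}

    tlb = TLB()
    for v in table_a:  # thread A warms the buffer
        tlb.translate(v, table_a)
    tlb.misses = 0

    if selective:
        tlb.invalidate(private_range)  # only the private mappings change
    else:
        tlb.flush()

    for v in table_b:  # thread B touches its whole working set
        tlb.translate(v, table_b)
    return tlb.misses


print(misses_after_switch(selective=True))   # 4
print(misses_after_switch(selective=False))  # 64
```

With these made-up numbers the selective case reloads only the four private
translations, while the full flush reloads all 64, which is the same penalty
a full task switch would pay.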

The same goes for data cache. If it caches physical addresses there
would be little problem. If it was a virtual address cache life
would be very much harder. You could never invalidate all the
appropriate entries in anything like the time that a full cache
refill would take, so a complete invalidation would be the cheapest
way out. Again no gain over conventional tasks.


Life might get unbelievably interesting if you wanted to use a
machine with an inverted page table and still keep your caches
alive. (However no experience with these so I won't opine further.)

|> Moreover the resources (physical pages) allocated to a terminated
|> thread can be kept by the "task" and ready to be allocated to the
|> next created thread/process.

I think you would buy an argument with the kernel over who gets
first pick of the physical pages. Nobody gets allocated physical
pages like this, you get the use of them for as long as you have a
good claim, and often not even that long. You most certainly never
get the option of hanging on for a rainy day. It's simply not part
of the abstraction.


Personally I would use such a facility and I suspect many others
could make use of it too. However I don't believe it is a goer
because a lot of existing hardware would make it uneconomical. It
would never work on any current Suns for instance. Other
architectures would be fine; Multimax would be no problem. As virtual
addressed caches die out it may well catch on.


More lies and drivel from.....
					Francis Vaughan