Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!husc6!encore!calliope!aral
From: aral@calliope.Encore.COM (Ziya Aral)
Newsgroups: comp.realtime
Subject: Re: Lightweight Tasks
Summary: variable weight processes
Message-ID: <11843@xenna.Encore.COM>
Date: 2 Aug 89 02:02:25 GMT
References: <2153@gmu90x.UUCP> <70900004@m.cs.uiuc.edu> <4405@tekcrl.LABS.TEK.COM>
Sender: news@Encore.COM
Reply-To: aral@calliope.UUCP (Ziya Aral)
Organization: Encore Computer Corp, Marlboro, MA
Lines: 107

In article <4405@tekcrl.LABS.TEK.COM> scotth@tekcrl.LABS.TEK.COM (Scott R. Herzinger) writes:
>
>What about additional shades of gray between "heavy" and "light"?
>
>At the USENIX conference in San Diego last January, someone from Data
>General presented a paper about their changes to fork() semantics and the
>organization of process context in order to provide something they called
>variable weight processes.  I don't have the proceedings at hand, so I
>can't give details and must disclaim that errors are mine, not the paper's
>author's.

"Data General"?!!!??! Sheesh...

The paper was by myself, Ilya Gertner, Greg Schaffer and Alan Langerman
from Encore and Tom Doeppner and Jim Bloom from Brown University....

"Data General"! That I can never forgive :-)

>
>The idea is that process context is partitioned into about several
>(about seven, I think) parts, each comprising things that roughly
>"belonged together" or had interdependencies such that they couldn't be
>separated.  A variant of fork() is provided that permitted the caller
>to create a new process for which any of the parts could be copied or
>shared.  A child that copies only the stack and otherwise shares
>*everything* is a light[est]weight task.  Normal fork() is provided to create
>children that copy everything (on-write, of course :-)).  There are
>lots of combinations in-between.  I suppose that some of them are
>very useful and some probably aren't interesting or useable at all.
>
>Instead of adding a new layer of context partitioning (e.g. lightweight
>tasks) to UNIX processes, graduated partitioning is provided.  Context
>management depends on the amount of non-shared context.  Lightweight
>tasks are available, but within a fairly conventional UNIX framework.
>
>Hope this is interesting,
>Scott
>--
>Scott Herzinger   scotth%crl.labs.tek.com@relay.cs.net
>                  Computer Research Lab, Tektronix, Inc.
>                  PO Box 500 MS 50-662, Beaverton, OR 97077

This is quite accurate except for "Data General" (sheesh again)....

Basically the idea was that multiprocessors change the machine model
which the operating system presents. The traditional process implements
a virtual machine roughly corresponding to a single user on a
time-sharing system, with hard partitions (separate address spaces,
system resources, etc.) enforced at kernel level. Multis, by contrast,
typically allocate many tightly communicating virtual machines to a
single user. Here the issue is resource-sharing semantics and the system
"weight" of that many processes, as reflected by process
creation/deletion times, context switch times, and demands on the
kernel's non-paging memory pool.

MACH tasks/threads represent one approach to this problem by essentially
allowing many "threads" to exist within one task. A task maps logically
to a traditional process and defines an address space etc., while
threads are in effect simple virtual cpus.... I suppose it could be
called a virtual shared-memory multiprocessor. One task with one thread
is the equivalent of a UNIX process.

The nUNIX kernel implemented at Encore took a slightly different tack.
Instead of instantiating multiple contexts within a process, we broke up
the traditional process control block by removing individual resources
(address space, file descriptors, signal handlers, process statistics,
etc.) and giving them an independent existence as Resource Descriptors.
fork() semantics were then changed to allow either the copying or the
sharing of individual resource descriptors at fork() time, through the
addition of a new resctl() call. Essentially, resctl() set the bits in a
resource mask which specified the behavior of fork() for that particular
process. If a process specified that its file descriptors were private
but its address space shared, then at fork() time a new fd resource
descriptor (a copy of the original) would be created, but a new address
space (i.e. new page tables) would not. Instead, the pcb of the child
process would simply point to the address resource descriptor (i.e. page
tables) of the parent.

In the final implementation, only the resctl() call was added and all
existing process control calls (fork, exit, wait, etc.) were retained.
Since the default behavior specified all private resources, complete
backward compatibility with existing UNIX fork() calls was maintained.
Also, the need for two different paradigms was avoided.

Resource descriptors were managed by associating a reference count with
each one. Each time a resource descriptor was referenced by a process,
the count was incremented, and each time such a process died, the count
was decremented. When the count hit zero, the descriptor was destroyed.

The results with this mechanism have been very encouraging. Process
creation times dropped from 35 - 128 ms. (System V.3 - Encore APC
machine) for data sets ranging from 0 to 16 megs., to a flat 2.5 ms. for
all-shared processes. This compares with a creation time of around
10 ms. for MACH threads on the same hardware. Context switch times have
also improved significantly, and kernel memory requirements dropped by
over two thirds per process.

Best of all, it took only a few hundred lines of code to implement,
given that we already had a multi-threaded kernel with sharable
resources (I admit, it is a big "given").
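
For anyone who wants to see the resource mask and reference counting
described above in concrete form, here is a rough user-space toy in C.
It is only a sketch and not the nUNIX code: the post does not give the
real resctl() signature, so toy_resctl(), toy_fork(), toy_exit(), the
RES_* names, struct resdesc and struct pcb are all hypothetical. It also
assumes the child inherits the parent's mask, and it does not copy any
real resource contents.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical resource kinds, taken from the examples named in the post. */
enum { RES_ADDR, RES_FD, RES_SIG, RES_STATS, RES_NKINDS };

/* A resource descriptor: exists independently of any one process. */
struct resdesc {
    int kind;
    int refcnt;              /* number of processes pointing at it */
};

/* Toy process control block: pointers to descriptors plus the fork mask. */
struct pcb {
    struct resdesc *res[RES_NKINDS];
    unsigned share_mask;     /* bit k set => share resource k at fork time */
};

static int live_descs = 0;   /* descriptors currently allocated */

static struct resdesc *res_new(int kind)
{
    struct resdesc *r = malloc(sizeof *r);
    if (r == NULL)
        abort();
    r->kind = kind;
    r->refcnt = 1;
    live_descs++;
    return r;
}

/* First process: every resource private (the backward-compatible default). */
static struct pcb *proc_new(void)
{
    struct pcb *p = malloc(sizeof *p);
    if (p == NULL)
        abort();
    for (int k = 0; k < RES_NKINDS; k++)
        p->res[k] = res_new(k);
    p->share_mask = 0;
    return p;
}

/* Stand-in for resctl(): choose which resources later forks will share. */
static void toy_resctl(struct pcb *p, unsigned mask)
{
    p->share_mask = mask;
}

/* Stand-in for fork(): share or copy each descriptor according to the mask. */
static struct pcb *toy_fork(struct pcb *parent)
{
    struct pcb *child = malloc(sizeof *child);
    if (child == NULL)
        abort();
    child->share_mask = parent->share_mask;   /* assumption: mask is inherited */
    for (int k = 0; k < RES_NKINDS; k++) {
        if (parent->share_mask & (1u << k)) {
            child->res[k] = parent->res[k];   /* point at parent's descriptor */
            child->res[k]->refcnt++;
        } else {
            child->res[k] = res_new(k);       /* a kernel would copy contents */
        }
    }
    return child;
}

/* Stand-in for exit(): drop references, destroy descriptors that hit zero. */
static void toy_exit(struct pcb *p)
{
    for (int k = 0; k < RES_NKINDS; k++) {
        if (--p->res[k]->refcnt == 0) {
            free(p->res[k]);
            live_descs--;
        }
    }
    free(p);
}

/* Private fds, shared address space, as in the example in the post. */
static int toy_demo(void)
{
    struct pcb *parent = proc_new();
    toy_resctl(parent, 1u << RES_ADDR);
    struct pcb *child = toy_fork(parent);

    assert(child->res[RES_ADDR] == parent->res[RES_ADDR]);
    assert(child->res[RES_FD] != parent->res[RES_FD]);
    printf("address-space refcount after fork: %d\n",
           parent->res[RES_ADDR]->refcnt);

    toy_exit(child);
    toy_exit(parent);
    printf("live descriptors after both exit: %d\n", live_descs);
    return live_descs;
}
```

Calling toy_demo() from a main() should report a refcount of 2 on the
shared address-space descriptor and 0 live descriptors at the end. With
the mask left at 0, toy_fork() copies everything, which is the plain
fork() case.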

As a final aside, "variable weight" is probably a misnomer, as almost
all the weight of UNIX processes (startup, switch, and memory
requirements) comes from the need to create, initialize, switch, and
maintain page tables. All the other resources count as "noise" from the
standpoint of efficiency, although they do provide the possibility of
highly flexible combinations of shared and private resources.

Hope this helps....

Ziya Aral
Encore Computer