Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!ucsd!sdcsvax!ucsdhub!hp-sdd!ncr-sd!ncrlnk!uunet!mcvax!cernvax!ethz!iis!prl From: prl@iis.UUCP (Peter Lamb) Newsgroups: comp.lang.c++ Subject: Some problems in the task class Keywords: task, bugs Message-ID: <810@eiger.iis.UUCP> Date: 2 Apr 89 15:20:02 GMT Organization: Integrated Systems Lab., ETH Zuerich Lines: 258 I have a number of problems with the task class as distributed with AT&T 1.2.1 cfront. Some of these are just plain bugs, for which there are relatively simple fixes, others are real problems with the basic idea of the implementation. I have been thinking about writing this article for a while, but have been waiting for the release of cfront 2.0, which I had hoped may have addressed some of the problems. However, I have just been told by AT&T Unix Europe that cfront 2.0 will be not be out until 3Q89. These problems became apparent while I was trying to understand the task library sufficiently to port it to our Sequent Symmetry (i386 CPU) and to our Alliant FX/80, and having had to fix some of the bugs mentioned below on these machines and on our Suns and VAXen. BUGS. It is to be expected that code like this might be buggy. Implementing the low level of concurrency is tricky. It was the equivalent code in V6 Unix which contained the immortal /* You are not expected to understand this */ comment. The bugs that I know of are: 1) In lib/task/task.c, the macros SETTRAP() and CHECKTRAP() are not defined as part of the machine-dependent setup, although they depend on the machine's stack growth direction, and they are set up for the unconventional stack-grows-to-increasing-addresses direction of the AT&T 3BX. This is easily fixed. 2) The routines task::save() and task::restore() do not save/restore the register state (apart from SP, FP and AP). This means that if you use register variables in tasks (or functions called from tasks) you may get surprising results. A half-fix for this is easy, but a real fix is difficult. 3) The routine _sswap(), used for switching stacks in SHARED mode uses its arguments on the (old) stack, after the new stack has been copied in. There are a few words of slop, so that this code sometimes works, but if the new stack is much bigger than the old stack, the values will have been wiped out (in the VAX and Sun versions, at least). 4) The code for _sswap() in sun_swap.s copies the stack in the wrong direction!! (ie, from the stack base towards increasing addresses for the stack size, instead of from the stack base towards decreasing addresses for the stack size). 5) The code for the SHARED stack mode copies too much data. It copies the whole stack frame of the establisher of the task in and out on each task swap. This doesn't matter normally provided that you never reference data in the establisher's frame while running another task. This is forbidden by the paper describing the task system, but there is a way in which it can happen for a fairly reasonable program: main() { my_task a(......); my_task b(......); thistask->resultis(0); } when a task runs, an old copy of the task is put back onto the area where the `correct' task should be; this causes havoc when the task switches to a new task, since all the data associated with the implementation of tasks a and b will be wrong! The bugs 1, 3 and 4 are easily fixed. A fix exists for 5, but I haven't tried to do it, since simple workarounds exist; eg main() { static my_task a(....); static my_task b(....); } I have a partial fix for 2, which brings us to the real problem. THE PROBLEM There doesn't seem to be a reasonable way to implement a reasonably robust version of the task library in its current form given: 1) The current method of constructing tasks (the task code is the constructor of a class derived from class task). 2) That most implementations of C++ (either through their C compiler or directly in the case of g++) use callee-saves for saving register state. This is independent of problems which may occur if DEDICATED mode tasks overrun their stacks. I will attempt to substantiate this claim... HOW TASKS GET CREATED The task creation code does the following (this is slightly simplified, for the real story, see the code!): 1. Establishes the task environment; creates the task stack and copies in the arguments to the task (and incidently, the data part of the creator's frame), patches the static chain in the stack if the stack is DEDICATED and saves the information needed to (re)start the task. Currently this information is the FP, AP and `this' of the task. 2. The task must now return to the creator of the task without executing the body of the user's constructor. A task constructor gets turned into code which does the following: mytask__ctor() { task__ctor(); /* body of my task constructor code */ } task__ctor() `returns' in two contexts: the `normal' return to the caller of the constructor, and in the context of the task. The `normal' return is in no way normal! The `normal' return must *not* execute any code in the body of my constructor (this is often an infinite loop, anyway!). What is done is that the lowlevel code of task.c adjusts the frame pointer to point to mytask__ctor()'s frame and then does a return. This is the nub of the problem. This means that any registers saved by the startup code of task__ctor() or any of the subroutines along the calling path from task__ctor() to the routine which does this manipulation (typically _swap(), an assembly-language routine) won't get restored, and if the caller of the constructor has any active register variables, they may get wiped out. The problem lies in the fact that the caller of a routine depends on the callee-saves protocol being obeyed by all routines along the way. This protocol is broken by the task library code. Even refraining from using register variables may not help. Modern C (and C++) compilers may do their own register allocation. AT&T cfront should put `this' into a register if it is referenced more than twice (it doesn't, though. I have sent a fix for this to the AT&T C++ people. More about fixes later). g++ will automatically place things in registers even if -O is not used (this is a feature, not a bug). Note: there _is_ a way to save and restore registers correctly once the task has been established. It is a bit of a hack, but it makes the task library somewhat less rickety. However, the problem of maintaining the registers correctly at task creation time is a nasty one, especially if 100% code compatibility is needed. POSSIBLE FIXES The easiest one is to change the way tasks are created. This can be done by adding an explicit operator to the task to detach it. Near-compatibility can then be achieved by a macro: class task { protected: int detach(); // Returns 1 in the task, 0 in the creator public: task(); }; #define TASK if(detach()) class mytask : task { public: mytask(); }; mytask::mytask() { TASK { // task body goes here } } The crucial difference here is that detach() returns normally to mytask::mytask(), which means that the stack is all unrolled correctly. (Unfortunately I don't have an implementation of this). Another approach would be to move the task code out of the constructor: class task : { public: virtual void taskmain(); task(); }; class mytask : public task { void taskmain(); }; void taskmain() { // Task body } This has the disadvantage that the task body can only be passed arguments as data in the object. It has the advantage of not needing to copy a piece of the caller's stack into the stack frame to establish the task. I would be interested in hearing from anyone who has successfully used the task library from cfront 1.2 for any non-trivial programs, and what problems they have had and/or fixed. FIXES & AT&T In September/October last year, I contacted AT&T Unix Europe to ask who I might send some compiler bug reports and some fixes to. I was given an email address and I contacted the person, who was happy to pass on the bug reports and fixes to the development people. NB: Cfront is *not* supported, so this was simply AT&T goodwill, which was much appreciated. In the same note, I (foolishly ?) asked if there would be any objection to my posting the context diffs for these fixes on the net. the response was: 'Frankly, I do not know what the license stipulates. I will check. In the meantime, please do not. Because cfront is not a public domain product, I don't think that exposure of the code like this is the proper thing to do. I will look into this and get back to you to determine what you can do.' I have not heard anything since, though I enquired after it once. So I am unfortunately in the position of having been told by an AT&T employee that sending such fixes may be in breach of our license, and that I should not post any such fixes. Incidently, I have not sent my changes to the task library to AT&T, since I don't think that they really fix the problem. -- Peter Lamb uucp: uunet!mcvax!ethz!prl eunet: prl@ethz.uucp Tel: +411 256 5241 Integrated Systems Laboratory ETH-Zentrum, 8092 Zurich