Path: utzoo!attcan!uunet!seismo!sundc!pitstop!sun!amdcad!ames!lll-lcc!lll-tis!helios.ee.lbl.gov!pasteur!agate!saturn!edler@cmcl2.nyu.edu
From: edler@cmcl2.nyu.edu (Jan Edler)
Newsgroups: comp.os.research
Subject: Re: Question on thread/process scheduler interaction
Message-ID: <5460@saturn.ucsc.edu>
Date: 14 Nov 88 19:26:30 GMT
Sender: usenet@saturn.ucsc.edu
Organization: New York University, Ultracomputer project
Lines: 77
Approved: comp-os-research@jupiter.ucsc.edu

This topic is a few weeks old, but we had some kind of news problem
here that delayed my posting this.

In article <5296@saturn.ucsc.edu> fouts@lemming. (Marty Fouts) writes:
>N tasks (threads) are started on top of M processes on a
>multiprocessor with P processors (N >= M >= P > 1).
...
>One task enters the critical region in one process.  While there, a
>second task running on a different process spinlocks on the semaphore.
...
>The process level scheduler preempts the process running the task in
>the critical region leaving the second process at the spinlock.  The
>worst case is when the process is preempted by a third process
>belonging to the same job which also reaches the spinlock.

Our solution to this problem is a mechanism called "temporary
nonpreemption".  A process informs the kernel of temporary conditions
that make preemption unadvisable.  The scheduler in the kernel honors
this advice, up to a limit to prevent abuse.  This solves the problem
as long as the maximum time in the critical section is less than the
limit the scheduler is willing to honor.

The implementation is very efficient.  Two special communication
variables are used.  They are ordinary integers, residing in the user's
address space, but their location is known to the scheduler.  Both are
initialized to 0.  The first is set to a non-zero value by the user to
request temporary non-preemption, and the second is set to a non-zero
value by the scheduler to warn the user that preemption will soon be
required.
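
To make the condition above concrete (the critical section must be shorter
than the limit the scheduler will honor), here is a minimal single-threaded
sketch in C.  It is only an illustration, not our kernel code: the names
np_request, np_warning, DEFER_LIMIT and survives_preemption() are all
hypothetical, time is modeled as abstract ticks, and the scheduler is
assumed to want the processor from the first tick of the critical section
(the worst case).

```c
#include <assert.h>

/* Assumed limit: how many ticks the scheduler will defer preemption. */
#define DEFER_LIMIT 3

/* The two communication variables (hypothetical names), both initially 0.
 * np_request is written by the user, np_warning by the scheduler. */
static volatile int np_request = 0;
static volatile int np_warning = 0;

/*
 * Simulate a critical section lasting `length` ticks while the scheduler
 * wants to preempt on every tick.  Returns 1 if the section finishes
 * before preemption is forced, 0 if the scheduler's patience runs out.
 */
int survives_preemption(int length)
{
    int deferred = 0;   /* ticks for which preemption has been deferred */
    int done = 0;       /* ticks of critical-section work completed */

    assert(length >= 0);

    np_warning = 0;
    np_request = 1;                 /* user: request non-preemption */

    while (done < length) {
        /* Scheduler side: the request is set, so defer if still allowed. */
        if (deferred >= DEFER_LIMIT) {
            np_request = 0;         /* model ends: preemption is forced */
            return 0;
        }
        np_warning = 1;             /* warn the user: preemption is due */
        deferred++;

        done++;                     /* user side: one tick of work */
    }

    np_request = 0;                 /* user: relinquish non-preemption */
    return 1;
}
```

In this model, survives_preemption(2) returns 1 and leaves np_warning set
(the user would now be expected to give up the processor), while
survives_preemption(5) returns 0 because the section outlasts the limit.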

We assume the scheduler runs like an interrupt handler.  Here is
pseudo-code:

	tempnopreempt()		/* request temporary nonpreemption */
	{
		var1++;
	}

	tempokpreempt()		/* relinquish temporary non-preemption */
	{
		int temp;

		if (--var1 == 0 && (temp=var2) != 0) {
			var2 = 0;
			yield();
			return temp;
		}
		return 0;
	}

Yield() is a new system call that reschedules the processor.  Because
var1 is incremented and decremented, these calls nest.

When the scheduler wants to preempt the currently running process, it
executes code like this:

	if (var1 == 0)
		ok to preempt;
	else if (preemption not already pending for this process) {
		var2 = 1;	/* notify user */
		note preemption pending;
	}
	else if (preemption pending for maximum allowable time) {
		var2 = 2;	/* time is up! */
		ok to force preemption;
	}

The pending preemption status is cleared on actual preemptions and on
calls to yield().  The purpose of the tempokpreempt() return value is to
notify the user (after the fact) whether preemption was requested (1) or
forced (2).

Overhead is only a couple of instructions in the common case where no
preemption would have occurred anyway, and only the overhead of yield()
otherwise.  The most abusive a user can get is to lengthen a time slice
by a small amount, and this can be compensated for by shortening the
next time slice.

Jan Edler
NYU Ultracomputer Project
edler@nyu.edu
...!cmcl2!edler
(212) 998-3353