Xref: utzoo unix-pc.general:2487 comp.sys.att:5871
Path: utzoo!censor!becker!ziebmef!cks
From: cks@ziebmef.uucp (Chris Siebenmann)
Newsgroups: unix-pc.general,comp.sys.att
Subject: Re: Question about windows and processor time (3b1)
Message-ID: <1989Mar19.014937.19765@ziebmef.uucp>
Date: 19 Mar 89 06:49:34 GMT
References: <356@flatline.UUCP> <450@amanue.UUCP> <1211@lemuria.usi.com> <451@amanue.UUCP>
Reply-To: cks@ziebmef.UUCP (Chris Siebenmann)
Organization: Journeyman Ultrix/BSD Kernel Hacks, UofToronto Branch
Lines: 64

In article <451@amanue.UUCP> jr@amanue.UUCP (Jim Rosenberg) writes:
| My main argument still stands.  It's my understanding that sleep/wakeup is
| used by the kernel to manage resources which are *NOT AVAILABLE*.  E.g.  if a
| block is needed in the buffer cache and none is available the process will
| sleep until one is available.  That is *NOT THE SAME THING* as saying that the
| buffer cache as a data structure is *PROTECTED AGAINST CORRUPTION* by
| sleep/wakeup.  I believe in fact this just not the case.

 sleep()/wakeup() can be used to protect data, and some things do use
it that way (for example, locking an in-core inode -- if the inode is
already locked, you sleep() waiting for it to be unlocked so you can
lock it). Most things just use it to wait for resources to become
available.
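
 A minimal sketch of the inode-locking idiom, written as a user-space
model rather than real kernel code. The ILOCKED/IWANT flag names follow
the usual convention, but ksleep(), kwakeup(), the while_asleep hook and
the demo are hypothetical stand-ins made up for illustration; the "other
process" is simulated by a function that runs while we are "asleep".

```c
#include <assert.h>
#include <stddef.h>

#define ILOCKED 0x1	/* inode is locked */
#define IWANT   0x2	/* somebody is sleeping, waiting for the lock */

struct inode {
	int i_flag;
};

/* Hypothetical user-space stand-ins for the kernel's sleep()/wakeup(). */
static void (*while_asleep)(void);	/* what other processes do while we sleep */
static void *last_wakeup_chan;		/* channel passed to the latest wakeup */

static void ksleep(void *chan)
{
	(void)chan;
	/* The sleeper gives up the CPU; model that by letting others run. */
	if (while_asleep != NULL)
		while_asleep();
}

static void kwakeup(void *chan)
{
	last_wakeup_chan = chan;
}

void ilock(struct inode *ip)
{
	/* Re-test after every sleep: someone else may have taken the lock. */
	while (ip->i_flag & ILOCKED) {
		ip->i_flag |= IWANT;
		ksleep(ip);
	}
	ip->i_flag |= ILOCKED;
}

void iunlock(struct inode *ip)
{
	assert(ip->i_flag & ILOCKED);
	ip->i_flag &= ~ILOCKED;
	if (ip->i_flag & IWANT) {
		ip->i_flag &= ~IWANT;
		kwakeup(ip);
	}
}

/* Demo: the inode starts out locked by "another process", which
 * unlocks it while we are asleep. Returns 1 if we end up holding it. */
static struct inode demo_inode;

static void other_process_unlocks(void)
{
	iunlock(&demo_inode);
}

int ilock_demo(void)
{
	demo_inode.i_flag = ILOCKED;
	while_asleep = other_process_unlocks;
	ilock(&demo_inode);
	return demo_inode.i_flag == ILOCKED && last_wakeup_chan == &demo_inode;
}
```

 Note that in this model ilock() spins forever if nobody ever unlocks
the inode; in a real kernel the process would simply stay asleep.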

| I believe in fact
| there is *no* general mechanism by which critical regions of code in the
| kernel protect data structures against corruption that would occur from being
| arbitrarily reentered except the two I mentioned:  (1) Knowledge and care that
| a race condition is in fact not possible; (2) disabling interrupts. 

 Certainly there doesn't seem to be one in Ultrix/BSD. Any amount of
code relies on its ability to go merrily traipsing down linked lists
of buffers, for example, without any locking at all. I've even (ahem)
written some. It's a very real worry, because you suddenly have to
figure out which kernel routines can sleep() and perhaps let someone
else in to destroy that data structure you've been carefully building.
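
 To make the worry concrete, here is a small user-space model of
walking a buffer list when something in the loop can sleep. It is only
a sketch: struct buf is cut down to three fields, and ksleep(),
getblk_sketch() and the demo are hypothetical names. The defensive
move it shows is to rescan from the head of the list after any sleep
instead of trusting a pointer picked up before it.

```c
#include <assert.h>
#include <stddef.h>

struct buf {
	struct buf *b_forw;	/* next buffer on the list */
	int b_blkno;		/* block this buffer currently holds */
	int b_busy;		/* nonzero while somebody is using it */
};

static struct buf *bufhead;
static void (*while_asleep)(void);	/* what others do while we sleep */

/* Hypothetical stand-in for sleep(): lets "other processes" run once. */
static void ksleep(void *chan)
{
	void (*others)(void) = while_asleep;

	(void)chan;
	while_asleep = NULL;
	if (others != NULL)
		others();
}

struct buf *getblk_sketch(int blkno)
{
	struct buf *bp;

	assert(blkno >= 0);
loop:
	for (bp = bufhead; bp != NULL; bp = bp->b_forw) {
		if (bp->b_blkno != blkno)
			continue;
		if (bp->b_busy) {
			ksleep(bp);
			/* The list may have changed while we slept, so bp
			 * cannot be trusted any more: start over. */
			goto loop;
		}
		bp->b_busy = 1;
		return bp;
	}
	return NULL;	/* not in the cache */
}

/* Demo list: block 1, block 2 (busy), block 3. */
static struct buf b3 = { NULL, 3, 0 };
static struct buf b2 = { &b3, 2, 1 };
static struct buf b1 = { &b2, 1, 0 };

/* While we sleep, the buffer we wanted is released and reused. */
static void other_process_recycles_b2(void)
{
	b2.b_blkno = 9;
	b2.b_busy = 0;
}

int getblk_demo(void)
{
	bufhead = &b1;
	b2.b_blkno = 2;
	b2.b_busy = 1;
	while_asleep = other_process_recycles_b2;
	/* Block 2 is gone by the time we wake up, so the rescan finds nothing. */
	return getblk_sketch(2) == NULL && b2.b_busy == 0;
}
```

 Had the loop simply marked bp busy after waking up, it would have
handed back a buffer that now holds a different block.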

| Now answer me one question.  If the kernel is so hunky dory inside, just why
| does a driver writer have to know about spl()??  (To know *A LOT* about spl()
| in fact.)  In my opinion a driver writer should only need to know how to mask
| interrupts for the device being driven.

 This should be sufficient as long as the driver is only manipulating
data structures 'owned' by it (either private data structures or
things like buffers it's putting information into). It's when you get
into things like multiple ethernet interfaces at different levels that
you get into trouble -- and watch out for simple locks, lest you wind
up deadlocked in a high-interrupt condition.
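
 For the simple case of a driver guarding its own data, the usual
shape is "raise the priority level, touch the data, restore the old
level". The sketch below models that in user space; sim_splraise(),
sim_splx(), sim_interrupt(), IPL_DEV and the queue are all
hypothetical, and the model holds at most one pending interrupt.

```c
#include <assert.h>
#include <stddef.h>

#define IPL_NONE 0
#define IPL_DEV  5	/* hypothetical level of our device */
#define QSIZE    8

static int cur_ipl = IPL_NONE;
static void (*pending_handler)(void);	/* one deferred interrupt, if any */
static int pending_level;

/* Run a handler at its own level, then drop back. */
static void run_handler(int level, void (*handler)(void))
{
	int saved = cur_ipl;

	cur_ipl = level;
	handler();
	cur_ipl = saved;
}

/* Raise the level (never lower it) and return the previous one. */
static int sim_splraise(int level)
{
	int s = cur_ipl;

	if (level > cur_ipl)
		cur_ipl = level;
	return s;
}

/* Restore a saved level and deliver anything that was held off. */
static void sim_splx(int s)
{
	cur_ipl = s;
	if (pending_handler != NULL && pending_level > cur_ipl) {
		void (*h)(void) = pending_handler;

		pending_handler = NULL;
		run_handler(pending_level, h);
	}
}

/* The device interrupts: run now if allowed, otherwise defer. */
static void sim_interrupt(int level, void (*handler)(void))
{
	if (level > cur_ipl) {
		run_handler(level, handler);
	} else {
		pending_handler = handler;
		pending_level = level;
	}
}

/* Driver-owned queue shared by the top half and the interrupt handler. */
static int queue[QSIZE];
static int qlen;
static int updating;		/* set while the queue is half-updated */
static int saw_half_update;	/* set if the handler ran in mid-update */

static void dev_intr(void)
{
	if (updating)
		saw_half_update = 1;
	if (qlen > 0)
		qlen--;		/* consume one entry */
}

static void dev_enqueue(int v)
{
	int s = sim_splraise(IPL_DEV);

	assert(qlen < QSIZE);
	updating = 1;
	queue[qlen] = v;
	sim_interrupt(IPL_DEV, dev_intr);	/* device fires in mid-update */
	qlen++;
	updating = 0;
	sim_splx(s);				/* deferred interrupt runs here */
}

int spl_demo(void)
{
	dev_enqueue(42);
	return saw_half_update == 0 && qlen == 0;
}
```

 The interrupt that arrives in the middle of dev_enqueue() is held
off until sim_splx(), so the handler never sees the half-updated queue.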

| I'd be amused to see how long your kernel you run would last if all
| the spl()'s in all the drivers you use were excised in favor of
| sleep/wakeups.  I bet you wouldn't be amused at all.

 It wouldn't last long at all, in fact. Remember that sleep()/wakeup()
take place in the context of a process; interrupt routines have no
process context to do a sleep() in (actually, they 'have' a process
context -- the context of whatever random process happened to be
active when the interrupt happened). This lack of a process context
at interrupt level bites NFS in BSD systems badly; the server side of
NFS is a program that forks itself N times and then immediately dives
into the kernel, never to return. It has to be a process because the
NFS routines need to sleep() both for disk I/O and for incoming
requests.
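
 The "fork N times and dive into the kernel" structure looks roughly
like the sketch below. This is an assumption-laden illustration, not
the actual NFS daemon: enter_kernel_service_loop() is a hypothetical
stand-in for the system call that never returns, and here it returns
at once so that the example terminates.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the call that enters the kernel's service
 * loop. The real thing would sleep in the kernel waiting for requests
 * and for disk I/O, and never come back. */
static void enter_kernel_service_loop(int sock)
{
	(void)sock;
}

/* Fork nservers children; each one lends its process context to the
 * kernel. Returns the number of children actually started. */
int start_servers(int sock, int nservers)
{
	int i, started = 0;

	assert(nservers > 0);
	for (i = 0; i < nservers; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;			/* out of processes */
		if (pid == 0) {
			enter_kernel_service_loop(sock);
			_exit(0);		/* only reached in this sketch */
		}
		started++;
	}
	return started;
}

/* Collect n children; returns how many exited cleanly. */
int reap_servers(int n)
{
	int i, ok = 0, status;

	for (i = 0; i < n; i++) {
		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0)
			ok++;
	}
	return ok;
}
```

 Each child exists only to give the kernel something that can
sleep(); N of them means up to N requests can be blocked at once.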

 If people are interested in a paper on what sort of things are
needed, I'd recommend Bach's paper on adapting the kernel for
multiprocessor systems in the AT&T Bell Laboratories Technical Journal,
Vol 63 No 8 (reprinted in UNIX SYSTEM READINGS AND APPLICATIONS,
Volume II).

-- 
	"Though you may disappear, you're not forgotten here
	 And I will say to you, I will do what I can do"
Chris Siebenmann		uunet!{utgpu!moore,attcan!telly}!ziebmef!cks
cks@ziebmef.UUCP	     or	.....!utgpu!{,ontmoh!,ncrcan!brambo!}cks