Path: utzoo!attcan!uunet!seismo!sundc!pitstop!sun!amdcad!ames!mailrus!tut.cis.ohio-state.edu!ukma!rutgers!ucla-cs!admin.cognet.ucla.edu!casey
From: casey@admin.cognet.ucla.edu (Casey Leedom)
Newsgroups: comp.sys.apollo
Subject: Re: some questions for the gurus.
Message-ID: <16020@shemp.CS.UCLA.EDU>
Date: 16 Sep 88 10:12:43 GMT
References: <8809141502.AA01500@testnode.mit.edu>
Sender: news@CS.UCLA.EDU
Reply-To: casey@cs.ucla.edu (Casey Leedom)
Organization: UCLA
Lines: 40

In article <8809141502.AA01500@testnode.mit.edu> krowitz@testnode.MIT.EDU (David Krowitz) writes:
> Why would a random user *need* to shut down a node anyhow?

  Oops! I suddenly realize I may have been arguing on the wrong side
all this time ... :-)

  I shut down my node sometimes two or three times a day as it gets
locked in some weird state or another. A typical case is that our
gateway (also an Apollo) goes down while I have several telnet/rsh
connections running through it. For some reason this really screws my
node up and I'm forced to reboot. (This has been happening a lot
recently - I think our Apollo gateway is running out of mbufs or
something similar.)

  If I couldn't shut my machine down I'd be in a tough spot. I have the
root password, but what about all the other users? Since I'm Mr.
Support (can you really believe that? :-)), I'd be getting calls
constantly to come over and reboot so-and-so's node. Ack!!!

  Seriously, I fully agree with David that shut should be reserved for
root people just to prevent accidents. However, since we do need to
reboot the nodes so often ... This need may go away when we bring up
SR10 (we just got our tape a couple of days ago - yeah!), but since I
have the hanging problem on broken TCP connections with TCP3.1, and as
far as I know SR10's TCP is identical, I doubt it will go away
entirely ...

  I'll certainly endorse David's suggestions that shut warn you about
things like:

> 1) processes CRP in from another node
> 2) diskless partners that were currently booted off the disk
> 3) files that were opened from other nodes
> 4) if the node is a gateway

Casey

P.S. Does anyone know exactly why the broken TCP connections hang a
    node and what can be done about them? Or even what to do about an
    Apollo node being used as a gateway that seems to be reacting
    unfavorably to increased usage (but it should be pointed out that
    this increased usage hasn't even begun to make the gateway run
    slowly, etc.).