Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!husc6!rutgers!lll-crg!nike!ucbcad!ucbvax!CITHEX.CALTECH.EDU!carl
From: carl@CITHEX.CALTECH.EDU (Carl J Lydick)
Newsgroups: mod.computers.vax
Subject: Re: hung (sort of) processes
Message-ID: <861108025344.002@CitHex.Caltech.Edu>
Date: Sat, 8-Nov-86 05:53:57 EST
Article-I.D.: CitHex.861108025344.002
Posted: Sat Nov 8 05:53:57 1986
Date-Received: Sun, 9-Nov-86 03:57:38 EST
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The ARPA Internet
Lines: 55
Approved: info-vax@sri-kl.arpa

> All of the programs are written in straight forward fortran,
> no AST or sys calls, (except calls to UIS$ccccc). But they all hang
> at the same place in the program (but at random times through the loop)
> and all show a pc of 8008E204 which is obviously in some shareable
> library since the map of the main offending program has no addresses
> this high.

The first thing you should do, since you DO have the offending address
available, is to use the System Dump Analyzer (SDA, invoked in this
application by ANALYZE/SYSTEM) to find out what might be there. I do notice
on my VAXStation, though, that the address you specified sits between the
value of the global symbol RMS (= 8007740) and the address pointed to by the
global symbol EXE$GL_SYSMSG (= 80002C04, which points to 80090C00).
Furthermore, the symbol SYS$GL_UIS (= 80000EE8, which points to 800B3000)
would seem to indicate that, to the extent the two machines have similar
hardware (since the UIS stuff has to get loaded early on, the rest of the
software you've got shouldn't affect this), your problem is not with UIS but
with RMS.

In particular, I suspect it has something to do with RMS's handling of
terminal mailboxes. I've had similar problems using a plain old VT52
emulator on a 780 when I do much with programs that want to grab broadcast
messages before they can get to your terminal.
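
The address comparison above can be sketched as a few lines of arithmetic.
This is only an illustration, not anything SDA does for you: the names
PC, REGIONS, and regions_above are made up for the example, and the only
addresses used are the ones quoted in this message. The value given for RMS
is left out, since as printed it has only seven hex digits and I would
rather not guess at the missing one.

```python
# Addresses quoted in the post, in hex. The names are illustrative only.
PC = 0x8008E204  # program counter reported for the hung processes

# Start addresses that the two global symbols point to on the author's machine.
REGIONS = {
    "EXE$GL_SYSMSG target": 0x80090C00,
    "SYS$GL_UIS target": 0x800B3000,
}


def regions_above(pc, regions):
    """Return the names of regions that start above pc, lowest address first."""
    ordered = sorted(regions.items(), key=lambda item: item[1])
    return [name for name, start in ordered if start > pc]


if __name__ == "__main__":
    for name in regions_above(PC, REGIONS):
        gap = REGIONS[name] - PC
        print(f"{name} starts {gap:#x} bytes above the hung PC")
```

Both regions start above the reported PC, which is the basis for suspecting
that the hang is in code loaded below the UIS area. That only holds to the
extent the two machines are laid out alike.
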
There are two modes of hanging that I've seen:

  1) The process that is getting between your terminal and broadcast
     messages (in my case, generally TPU; in yours, UIS) suddenly goes
     crazy, ignoring attempts to communicate with it, and generally (in
     every case that I've checked) with an I/O request pending on a
     mailbox, and user-mode AST's disabled.

  2) Similar to 1), except that the situation arises when a subprocess
     terminates, but the parent process doesn't wake up. This seems to
     have something to do with the parent disabling terminal interrupts
     (control-C's, T's, and/or Y's). In disabling these, the parent
     process manages to disable ALL user-mode AST's, and doesn't want to
     wake up when the child dies (or maybe the part of the program
     grabbing broadcast messages grabs a process termination message
     instead, and doesn't handle it properly).

The workaround I've used in these situations is:

  A) Log in on another terminal and use SDA to figure out which mailbox
     is the culprit.

  B) Do a "SET HOST 0" to log in a job you don't care if you lose.

  C) With this new process, spawn a SYNCHRONOUS subprocess (do NOT use
     the /NOWAIT qualifier).

  D) Look for the definition of DCL$ATTACH_xxxxxxxx in the job logical
     name table, and replace that definition with one that references
     the mailbox that is causing the original job to hang.

  E) Log out from the subprocess. At this point, the parent of that job
     (the remote job you created with the "SET HOST" command) will be
     ignoring interrupts from the terminal; it will, in fact, be acting
     just the way the original job was, with one exception: the process
     that watches for control-Y's in remote jobs will see the
     control-Y's and let you abort the remote job.

  F) Verify that the original process is now responding, and log out
     from the job you started to deal with the problem.