Xref: utzoo comp.sys.att:6063 comp.unix.questions:12728 comp.unix.wizards:15430
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!cs.utexas.edu!uunet!ncrlnk!ncrcae!cs-col!vause
From: vause@cs-col.Columbia.NCR.COM (Sam Vause)
Newsgroups: comp.sys.att,comp.unix.questions,comp.unix.wizards
Subject: Re: Sys V/386/3.2 UNIX system getting hung (?)
Keywords: 6386, UNIX, kernel, confused
Message-ID: <228@cs-col.Columbia.NCR.COM>
Date: 7 Apr 89 13:45:50 GMT
References: <6226@homxc.ATT.COM>
Reply-To: vause@cs-col.Columbia.NCR.COM (Sam Vause)
Organization: NCR Customer Services Support Team, Columbia, SC
Lines: 85

In article <6226@homxc.ATT.COM> mrb1@homxc.ATT.COM (M.BAKER) writes:
> We have an AT&T 6386E system running UNIX SysV/3.2.
> While running our application, it has been observed to
> 'hang'.  Specifically, the application stops in the
> middle of things.  More importantly, all the terminal I/O
> stops.......including the system console.  You can't log
> in on a free getty.  Anything you
> type gets echoed back to the screen, but nothing gets
> done with it...

Well, it's possible that the clist increment mentioned later in the
original posting is actually *hurting* the situation, rather than
helping.  My experience indicates that this symptom can arise from a
variety of situations, but personal observation leads me to believe
that the kernel logical address space is being exhausted.

Perhaps the best method of identifying the actual problem (in the
absence of a memory dump) is to use the crash(1m) command on the
running kernel to examine the status of the System Page Table Map.
Although I am not personally familiar with the way this command
executes on other machines, I have used it during kernel debug enough
to give you the general expectations:

	# crash
	> stat
		sysname: UnixV
		nodename: cs-col
		release: 020001
		version: config
		machine: 68020
		time of crash: Fri Apr  7 09:11:05 1989
		age of system: 21 day, 23 hr.,
	> map sptmap
	sptmap
	address   size
	00000000  97
	00001f99  71
	2 segments, 168 units
	> od maxspace
	00e67e18:  00100000

I've included this example from my machine (NCR TOWER 32/600) for your
reference.  For this system, there are only two segments and a total
of 168 units (clicks of 2K each) of System Page Table (SPT) space
left.  The first segment is reserved for the actual kernel code
itself, and is not generally available to the user.  The second
segment (and any following ones) is available to user processes (but
not until the fork(2) system call returns...).

Since the MAXSPACE kernel configuration parameter is 0x100000, each
active process will dynamically sptalloc() 4K of kernel SPT space.
(Your mileage may vary...)  For this machine, each 1MB (0x100000)
increase to the MAXSPACE parameter will also place an additional 2K
burden on each process's SPT requirements.  For this machine, I can
realistically create only 35 additional processes (71 clicks * 2K /
4K).

What this all means is that systems where SPT space is tight will
exhibit the symptoms you've described: character echo at the terminal
is okay, but no processes appear to be in execution.  System
degradation appears to occur slowly, rather than "all at once".
Generally, no error messages are written to the console.  On such
systems, crash(1m) has shown the SPT space to be generally fewer than
4 segments, with a *total* of fewer than 120 units.

The cure?  Well, if possible, increase your Kernel Address Space size.
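
For anyone who wants to redo the headroom arithmetic above on their
own numbers, here is a rough sketch in Python.  It is not part of
crash(1m) and not anything official: the function names are
hypothetical, the 2K unit size and the "4K at 1MB MAXSPACE, plus 2K per
extra 1MB" cost are simply the figures quoted above for the NCR TOWER
32/600, and the parser assumes `map sptmap` output laid out like the
sample.  Other machines will almost certainly differ.

```python
import re

UNIT_KB = 2  # assumed size of one sptmap unit (click), per the figures above


def parse_sptmap(text):
    """Return free-segment sizes (in units) from `map sptmap`-style output."""
    sizes = []
    for line in text.splitlines():
        # Data rows are assumed to be: 8-digit hex address, decimal size.
        m = re.fullmatch(r"\s*[0-9a-fA-F]{8}\s+(\d+)\s*", line)
        if m:
            sizes.append(int(m.group(1)))
    return sizes


def per_process_kb(maxspace):
    """Assumed SPT cost per process: 4K at 1MB MAXSPACE, +2K per extra 1MB."""
    extra_mb = max(0, maxspace // 0x100000 - 1)
    return 4 + 2 * extra_mb


def process_headroom(sizes, maxspace):
    """Processes that still fit, ignoring the first (kernel) segment."""
    usable_units = sum(sizes[1:])
    return usable_units * UNIT_KB // per_process_kb(maxspace)


def looks_tight(sizes):
    """Rule of thumb from above: under 4 segments and under 120 units total."""
    return len(sizes) < 4 and sum(sizes) < 120


SAMPLE = """\
sptmap
address   size
00000000  97
00001f99  71
2 segments, 168 units
"""

if __name__ == "__main__":
    sizes = parse_sptmap(SAMPLE)
    print(sizes, process_headroom(sizes, 0x100000), looks_tight(sizes))
```

Feeding it the sample map and MAXSPACE of 0x100000 reproduces the 35
processes worked out by hand; treat any other result as a ballpark
only.
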
If there is not already a configuration parameter for this purpose,
your only alternatives are to reduce the number of buffers and clists,
in order to furnish more kernel logical address space for SPT usage,
and to delete any kernel features and drivers you do not need.
Failing this, you get to buy another machine...

Perhaps this is not your actual situation, but it sure sounds
*PAINFULLY* similar to situations that I've recently encountered....

+------------------------------------------------------------------+
|Sam Vause, NCR Corporation, Customer Services - TOWER Support     |
|3325 Platt Springs Road, West Columbia, SC 29169  (803) 791-6953  |
|                  vause@cs-col.Columbia.NCR.COM                   |
|               ...!uunet!ncrlnk!ncrcae!cs-col!vause               |
|          ...!ucbvax!sdcsvax!ncr-sd!ncrcae!cs-col!vause           |
+------------------------------------------------------------------+