Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!urz.unibas.ch!doelz
From: doelz@urz.unibas.ch (Reinhard Doelz)
Newsgroups: comp.sys.sgi
Subject: RE: login problem.
Message-ID: <220*doelz@urz.unibas.ch>
Date: 1 Mar 90 07:30:13 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 105

GREAT. I reported this problem to SGI in fall 1989, and they (ISO) told me
that others encounter the same problem. I hacked a workaround which
temporarily fixes the problem, because even 3.2.1 didn't fix it. The
following is a repost of a message I sent to INFO-IRIS in December.

Hope this helps...
Reinhard

===========================================

We are running a 120GTX under OS 3.2.1. The program shown below runs on two
processors. The graphics manager fails to start up and the graphics are
unusable (no window manager).

*DOCUMENTATION:*

The /usr/adm/SYSLOG says (truncated, only significant lines shown)

Dec 7 09:17:00 modl grcond[290]: CIO: IRIX System V Release 3.2 IP5 Version 10171414
Dec 7 09:17:00 modl grcond[290]: CIO: CPU 1 taking over time and accounting functions
Dec 7 09:17:00 modl grcond[290]: CIO: gfx_wait_cx: context switch timed out
Dec 7 09:17:00 modl grcond[290]: CIO:
Dec 7 09:17:00 modl grcond[290]: CIO: gm-2 (configured for IP5) 1.14+
Dec 7 09:17:00 modl grcond[290]: CIO:
Dec 7 09:17:00 modl grcond[290]: CIO: DEBUG_NOISE at 0x9806648C
Dec 7 09:17:00 modl grcond[290]: CIO: Loading PP ucode Version: @(#) PEAPOD 1.2 pp microcode assembler - 6/20/87
Dec 7 09:17:00 modl grcond[290]: CIO: Sat Aug 19 19:10:21 1989 user unknown revision(1.123CLOVER2IP5GT)
Dec 7 09:17:00 modl grcond[290]: CIO: tried and failed ... as reported.

... and therefore I conclude that the grcond is unable to start up.

*WORKAROUND:*

The IRIS is fully networked, running NFS, 4DDN and TCP/IP, and may therefore
be suffering from this.
Therefore, I changed the kernel configuration in /usr/sysgen/master.d to run
the network on CPU 0, as follows:

107c107
< #define NBUF 100 /* # buffers in disk buffer cache */
---
> #define NBUF 400 /* # buffers in disk buffer cache */
215c215
< #define MAXSC 26
---
> #define MAXSC 30
353c353
< int network_processor = 1;
---
> int network_processor = 0;
modl [/usr/sysgen/master.d] %

... did an lboot and problem solved.

*PROBLEM REPRODUCTION:*

The Fortran program causing the problem is a special application. However,
a C program will do it as well. The following is a dummy routine which
reproduces the crashes:

      real x(100,1000), y(100,1000)
      seed=123456
      do 100 i=1,100
      do 101 ii=1,1000
      x(ii,i)=rand(seed)
 101  continue
 100  continue
      write (6,*)'ran done'
      do 200 i=1,100
      do 201 ii=1,1000
      y(ii,i)=sin(x(ii,i))*cos(x(ii,i))
      y(ii,i)=
     * (y(ii,i)**1.003)**(1-(sin(x(ii,i))/1000))
      do 202 iii=1,900
      y(ii,i)=
     * y(ii,i)**(1-(sin(x(ii,i))/1000))
 202  continue
 201  continue
 200  continue
      stop
      end

pfa concurrentizes the 200 do loop, which gives a program that runs fully
in parallel.

*ALTERNATIVE WORKAROUND:*

In order to avoid the kernel modification you could also log in from
another (not the console) terminal, or even log in as root NOGRAPHICS,
and call the debugger saying

      dbx -p #

(# being the parallel job), which is equivalent to sending a schedctl call
to this process. One could do it more elegantly by using a small C routine,
but I didn't bother about that.

************************************************************************
* Dr. Reinhard Doelz         * SWITZERLAND                             *
* Biocomputing               *                                         *
* Biozentrum                 * doelz%urz.unibas.ch@relay.cs.net        *
* Klingelbergstrasse 70      *                                         *
* CH-4056 Basel              *                                         *
************************************************************************