Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!tut.cis.ohio-state.edu!ucbvax!agate!agate!ianb From: ianb@ocf.Berkeley.EDU (Ian Barkley) Newsgroups: comp.sys.apollo Subject: tcpd problems Message-ID: Date: 23 May 91 04:33:25 GMT Sender: root@agate.berkeley.edu (Charlie Root) Distribution: comp Organization: Open Computing Facility Lines: 376 Hello all. We have been having a problem recently with our tcpd. Recently, the connections to the outside world have been having a bad habit of dying or freezing for several minutes at a time. When I looked at the process list of typhoon (our gateway) during one of these cutoffs, I noted that tcpd's PRI was down at 8 or less. I tried running tcpd at lower nice numbers, down to -15, and while this seems to have speeded up net connections in general, it hasn't stopped the not-so-occassional net freeze; and a process listing of typhoon again reveals that tcpd's PRI is at 5 during these freezes. One time I tried running tcpd's nice number up to -18, so that it couldn't get a priority less than 3 (that of most of the other processes on typhoon). It worked fine for a while, then BLAM, the priority went to 3 and tcpd started eating 100% of the CPU time; the disk was accessable (slowly), but I couldn't get on and finally had to white-button the machine. These freezes seem to happen mostly under heavy loads. They last from a few seconds to several minutes. Even when tcpd is working, it's PRI tends to bounce between 1 and 3, even with a nice of -15. Typhoon is a 4500, and our only ethernet gateway. The 16 other machines in the ring are 3500's; typhoon and 8 others have disks. We are running 10.3 in it's BSD4.3-like mode on all machines. Relevant listing: When tcpd is working: [typhoon:~] 39} netstat -T 167/336 mbufs in use: 28/112 ( 80-byte) mbufs used/allocated 63/112 (1560-byte) mbufs used/allocated 76/112 (9216-byte) mbufs used/allocated 1187 Kbytes allocated to network (65% in use) 1757248 requests for memory denied Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Collis Timer lo0 9216 127.0.0 localhost 65 0 65 0 0 0 dr0 1268 128.32.234 typhoon.Berk 10405601 0 12147883 458 0 0 eth0 1500 128.32.184 ocf.Berkeley 11450577 0 9618549 11413 0 0 Host Stat Ref Pct Mapaddr tornado.Berkele U 0 0 00:01:fe:92:00:00 monsoon.Berkele U 0 0 00:01:ff:33:00:00 cyclone.Berkele U 0 0 00:02:01:36:00:00 whirlwind.Berke U 0 0 00:02:01:32:00:00 hailstorm.Berke U 0 0 00:01:ff:22:00:00 bigbang.Berkele U 0 0 00:01:fe:97:00:00 headcrash.Berke U 0 0 00:01:ed:17:00:00 tempest.Berkele U 1 0 00:02:02:1d:00:00 maelstrom.Berke U 1 0 00:01:f7:c3:00:00 sandstorm.Berke U 0 0 00:01:f8:b5:00:00 earthquake.Berk U 0 0 00:01:fb:ca:00:00 inr-1-ocf.Berke UG 2 0 00:00:93:e0:20:60 avalanche.Berke U 0 0 00:01:fc:0e:00:00 tsunami.Berkele U 0 0 00:02:00:e7:00:00 locusts.Berkele U 0 0 00:02:00:ef:00:00 plague.Berkeley U 0 0 00:02:24:5b:00:00 lightning.Berke U 0 0 00:02:01:42:00:00 Routing tables Destination Gateway Flags Hops Ref Use Interface inr-1-ocf.Berke G 1 2 9389386 eth0 routing: 0 bad routing redirects 0 dynamically created routes 0 new gateways due to redirects 0 destinations found unreachable 0 uses of a wildcard route ip: 191 bad header checksums 0 with size smaller than minimum 0 with data size < data length 0 with header length < data size 0 with data length < header length icmp: 0 calls to icmp_error 0 errors not generated 'cuz old message too short 0 errors not generated 'cuz old message was icmp Output histogram: echo reply: 853697 destination unreachable: 16953 echo: 23057 0 messages with bad code fields 0 messages < minimum length 0 bad checksums 0 messages with bad length 853697 message responses generated Input histogram: echo reply: 22981 destination unreachable: 41 source quench: 4 echo: 853697 time stamp reply: 22758 tcp: 1 bad header checksum 0 bad header offset fields 0 incomplete headers udp: 0 bad header checksums 0 incomplete headers 0 bad data length fields Active connections (including servers) PCB Proto Recv-Q Send-Q Local Address Foreign Address Via Gateway (state) 3b7315ac ip 1044 0 *.1025 *.* * 3b73ef30 tcp 0 0 ocf.Berkel.2895 violet.Ber.domain inr-1-ocf.Be ESTABLISHED 3b73d914 tcp 0 0 ocf.Berkel.2894 violet.Ber.domain inr-1-ocf.Be TIME_WAIT 3b740b90 tcp 276 0 ocf.Berkel.telnet iris.eecs..2898 * CLOSE_WAIT 3b7398ec tcp 0 0 *.printer *.* * LISTEN 3b743a20 tcp 0 0 typhoon.Be.1723 maelstrom..finger maelstrom.Be FIN_WAIT_2 3b735df8 tcp 0 0 typhoon.Be.1721 tempest.Be.finger tempest.Berk FIN_WAIT_2 3b73265c tcp 0 0 *.smtp *.* * LISTEN 3b730910 tcp 0 0 *.finger *.* * LISTEN 3b7302cc tcp 0 0 *.ftp *.* * LISTEN 3b730110 tcp 0 0 *.exec *.* * LISTEN 3b727f54 tcp 0 0 *.shell *.* * LISTEN 3b727d98 tcp 0 0 *.login *.* * LISTEN 3b727bc0 tcp 0 0 *.telnet *.* * LISTEN 3b73ee08 udp 0 0 ocf.Berkel.2893 violet.Ber.domain * 3b731d14 udp 609 0 *.syslog *.* * 3b7311b4 udp 0 0 *.1024 *.* * 3b731040 udp 0 0 *.ncs *.* * 3b730ee4 udp 0 0 *.timed *.* * 3b730dbc udp 0 0 *.who *.* * 3b730a08 udp 0 0 *.route *.* * 3b730564 udp 0 0 *.biff *.* * 3b730420 udp 0 0 *.ntalk *.* * [typhoon:~] 40} ps alx F UID PID PPID CP PRI NI ADDR SZ RSS STAT TTY TIME COMMAND 1008001 0 1 1 0 1-20 1 992 13 S < ? 0:19 /etc/init 3 0 2 0 0 17 0 0 0 0 R ? 3626:15 null 3 0 3 0 0 1 0 0 0 0 S ? 0:28 purifier 3 0 4 0 0 1 0 0 0 0 S ? 0:04 purifier 3 0 5 0 0 1 0 0 0 0 S ? 17:47 unwired_dx m 3 0 6 0 0 1 0 0 0 0 S ? 0:00 pinger 3 0 7 0 0 1 0 0 0 0 S ? 52:29 netreceive 3 0 8 0 0 1 0 0 0 0 S ? 59:25 netpaging 3 0 9 0 0 1 0 0 0 0 S ? 2:00 wired_dxm 3 0 10 0 0 1 0 0 0 0 S ? 84:43 netrequest 3 0 15 0 0 1 0 0 0 0 S ? 0:05 netreceive 1008001 0 1046 1 0 1-15 5 1888 93 S < ? 875:34 /etc/tcpd 1008001 0 1055 1 0 3 0 6 1920 58 S ? 0:31 /etc/inetd 1008001 0 1061 1 0 1-10 8 1920 55 S < ? 2:39 /etc/route d -f 1008001 0 1065 1 0 4 0 9 1952 180 S ? 6:06 /etc/rwhod 1008001 0 1068 1 0 5 0 a 2016 361 S ? 9:17 /etc/timed -M 1008001 0 1071 1 0 3 0 4 2080 37 S ? 0:02 /etc/ncs/l lbd 1008001 0 1074 1 0 1 -6 b 6624 1337 S < ? 276:02 /etc/rgyd 1008001 0 1080 1 0 3 0 e 608 130 S ? 2:31 /etc/cron 1008001 0 1092 1 0 3 0 3 2080 27 S ? 0:02 /usr/lib/s endmail -bd 1008001 14 1095 1 0 4 0 c 768 224 S ? 0:17 /sys/spm/s pm 1008001 0 1097 1 0 3 0 d 704 599 S ? 0:15 /sys/net/n etman 1008001 14 1099 1 0 3 0 11 2208 29 S ? 0:31 /sys/ns/ns _helper 1008001 14 1102 1 0 3 0 13 736 9 S ? 0:00 /etc/write d 1008001 14 1106 1 0 3 0 14 928 62 S ? 0:12 /sys/mbx/m bx_helper 1008001 0 2124 1 0 3 0 15 2048 73 S ? 0:00 /usr/lib/l pd 1008001 0 29632 1 0 3 0 22 2272 272 S ? 1:43 /etc/syslo gd 1008001 1132 2542 1095 0 4 0 2 1216 297 S crp00 0:17 /bin/tcsh 1008001 1132 2596 2542 0 3 0 7 1152 200 R crp00 0:00 ps alx 1008001 1132 2592 1095 0 3 0 12 1184 303 S crp01 0:02 /bin/tcsh 1008001 5821 2195 28550 0 3 0 f 672 39 S ttyp1 0:00 sz tape-2. 29 1008001 0 2196 1055 0 3 0 10 2048 61 S ttyp1 0:00 telnetd 1008001 5821 28550 1 0 3 0 20 928 145 S ttyp1 0:05 -csh When tcpd is failing: [typhoon:~] 41} ps alx F UID PID PPID CP PRI NI ADDR SZ RSS STAT TTY TIME COMMAND 1008001 0 1 1 0 1-20 1 992 13 S < ? 0:19 /etc/init 3 0 2 0 0 17 0 0 0 0 R ? 3627:27 null 3 0 3 0 0 1 0 0 0 0 S ? 0:28 purifier 3 0 4 0 0 2 0 0 0 0 S ? 0:04 purifier 3 0 5 0 0 1 0 0 0 0 S ? 17:47 unwired_dx m 3 0 6 0 0 1 0 0 0 0 S ? 0:00 pinger 3 0 7 0 0 1 0 0 0 0 S ? 52:29 netreceive 3 0 8 0 0 1 0 0 0 0 S ? 59:26 netpaging 3 0 9 0 0 1 0 0 0 0 S ? 2:00 wired_dxm 3 0 10 0 0 1 0 0 0 0 S ? 84:45 netrequest 3 0 15 0 0 1 0 0 0 0 S ? 0:05 netreceive 1008001 0 1046 1 0 5-15 5 1888 71 R < ? 876:30 /etc/tcpd 1008001 0 1055 1 0 3 0 6 1920 60 S ? 0:31 /etc/inetd 1008001 0 1061 1 0 1-10 8 1920 55 S < ? 2:39 /etc/route d -f 1008001 0 1065 1 0 3 0 9 1952 179 S ? 6:06 /etc/rwhod 1008001 0 1068 1 0 5 0 a 2016 354 S ? 9:17 /etc/timed -M 1008001 0 1071 1 0 3 0 4 2080 37 S ? 0:02 /etc/ncs/l lbd 1008001 0 1074 1 0 2 -6 b 6624 1403 S < ? 276:03 /etc/rgyd 1008001 0 1080 1 0 3 0 e 608 129 S ? 2:31 /etc/cron 1008001 0 1092 1 0 3 0 3 2080 27 S ? 0:02 /usr/lib/s endmail -bd 1008001 14 1095 1 0 3 0 c 768 221 S ? 0:17 /sys/spm/s pm 1008001 0 1097 1 0 3 0 d 704 599 S ? 0:15 /sys/net/netman 1008001 14 1099 1 0 3 0 11 2208 28 S ? 0:31 /sys/ns/ns _helper 1008001 14 1102 1 0 3 0 13 736 9 S ? 0:00 /etc/write d 1008001 14 1106 1 0 3 0 14 928 62 S ? 0:12 /sys/mbx/m bx_helper 1008001 0 2124 1 0 3 0 15 2048 73 S ? 0:00 /usr/lib/l pd 1008001 0 29632 1 0 3 0 22 2272 261 S ? 1:43 /etc/syslo gd 1008001 1132 2542 1095 0 3 0 2 1216 306 S crp00 0:17 /bin/tcsh 1008001 1132 2604 2542 0 3 0 7 1152 215 R crp00 0:00 ps alx 1008001 5821 2195 28550 0 3 0 f 672 39 S ttyp1 0:00 sz tape-2. 29 1008001 0 2196 1055 0 3 0 10 2048 61 S ttyp1 0:00 telnetd 1008001 5821 28550 1 0 3 0 20 928 145 S ttyp1 0:05 -csh [typhoon:~] 42} netstat -nT 335/336 mbufs in use: 111/112 ( 80-byte) mbufs used/allocated 112/112 (1560-byte) mbufs used/allocated 112/112 (9216-byte) mbufs used/allocated 1187 Kbytes allocated to network (99% in use) 1792761 requests for memory denied Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Collis Timer lo0 9216 127.0.0 127.0.0.1 65 0 65 0 0 0 dr0 1268 128.32.234 128.32.234.1 10407469 0 12150184 458 0 0 eth0 1500 128.32.184 128.32.184.2 11452816 0 9620314 11418 0 0 Host Stat Ref Pct Mapaddr 128.32.234.15 U 0 0 00:01:fe:92:00:00 128.32.234.13 U 0 0 00:01:ff:33:00:00 128.32.234.26 U 0 0 00:02:01:36:00:00 128.32.234.27 U 0 0 00:02:01:32:00:00 128.32.234.24 U 0 0 00:01:ff:22:00:00 128.32.234.30 U 0 0 00:01:fe:97:00:00 128.32.234.31 U 0 0 00:01:ed:17:00:00 128.32.234.28 U 1 0 00:02:02:1d:00:00 128.32.234.29 U 1 0 00:01:f7:c3:00:00 128.32.234.18 U 0 0 00:01:f8:b5:00:00 128.32.234.16 U 0 0 00:01:fb:ca:00:00 128.32.184.1 UG 0 2 00:00:93:e0:20:60 128.32.234.22 U 0 0 00:01:fc:0e:00:00 128.32.234.23 U 0 0 00:02:00:e7:00:00 128.32.234.20 U 0 0 00:02:00:ef:00:00 128.32.234.21 U 0 0 00:02:24:5b:00:00 128.32.234.10 U 0 0 00:02:01:42:00:00 Routing tables Destination Gateway Flags Hops Ref Use Interface 0.0.0 128.32.184.1 G 1 0 9391104 eth0 routing: 0 bad routing redirects 0 dynamically created routes 0 new gateways due to redirects 0 destinations found unreachable 0 uses of a wildcard route ip: 191 bad header checksums 0 with size smaller than minimum 0 with data size < data length 0 with header length < data size 0 with data length < header length icmp: 0 calls to icmp_error 0 errors not generated 'cuz old message too short 0 errors not generated 'cuz old message was icmp Output histogram: echo reply: 853896 destination unreachable: 16956 echo: 23074 0 messages with bad code fields 0 messages < minimum length 0 bad checksums 0 messages with bad length 853896 message responses generated Input histogram: echo reply: 22998 destination unreachable: 41 source quench: 4 echo: 853896 time stamp reply: 22758 tcp: 1 bad header checksum 0 bad header offset fields 0 incomplete headers udp: 0 bad header checksums 0 incomplete headers 0 bad data length fields Active connections (including servers) PCB Proto Recv-Q Send-Q Local Address Foreign Address Via Gateway (state) 3b7315ac ip 7884 0 *.1025 *.* * 3b740b90 tcp 276 0 128.32.184.254.23 128.120.57.20.2898 * CLOSE_WAIT 3b7398ec tcp 0 0 *.515 *.* * LISTEN 3b743a20 tcp 0 0 128.32.234.14.1723 128.32.234.29.79 128.32.234.2 9 FIN_WAIT_2 3b735df8 tcp 0 0 128.32.234.14.1721 128.32.234.28.79 128.32.234.2 8 FIN_WAIT_2 3b73265c tcp 0 0 *.25 *.* * LISTEN 3b730910 tcp 0 0 *.79 *.* * LISTEN 3b7302cc tcp 0 0 *.21 *.* * LISTEN 3b730110 tcp 0 0 *.512 *.* * LISTEN 3b727f54 tcp 0 0 *.514 *.* * LISTEN 3b727d98 tcp 0 0 *.513 *.* * LISTEN 3b727bc0 tcp 0 0 *.23 *.* * LISTEN 3b73ee08 udp 0 0 128.32.184.254.2893 128.32.136.22.53 * 3b731d14 udp 1507 0 *.514 *.* * 3b7311b4 udp 0 0 *.1024 *.* * 3b731040 udp 0 0 *.135 *.* * 3b730ee4 udp 0 0 *.525 *.* * 3b730dbc udp 0 0 *.513 *.* * (netstat freezes after printing that line) Thank you for your time. -Ian Barkley OCF staff ianb@ocf.berkeley.edu