Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!rutgers!topaz.rutgers.edu!hedrick
From: hedrick@topaz.rutgers.edu (Charles Hedrick)
Newsgroups: comp.dcom.lans,comp.protocols.tcp-ip
Subject: Re: SUN 3.4 problems
Message-ID: <14266@topaz.rutgers.edu>
Date: Thu, 27-Aug-87 15:21:58 EDT
Article-I.D.: topaz.14266
Posted: Thu Aug 27 15:21:58 1987
Date-Received: Sat, 29-Aug-87 10:07:18 EDT
References: <1615@briar.Philips.Com>
Organization: Rutgers Univ., New Brunswick, N.J.
Lines: 70
Xref: mnetor comp.dcom.lans:781 comp.protocols.tcp-ip:985

I claim no expertise in SunOS 3.4. We are using 3.2 with locally-added networking enhancements that put it somewhere between 3.3 and 3.4 in terms of functionality. However from your results, it sounds like Sun's diagnosis is right. The fact that your hosts all get "ie0: no carrier" or "Ethernet jammed" strongly indicates a broadcast storm. The fact that things work when you use a separate Ethernet suggests that there is no error in your software or setup. However it's not quite right to say that the problem is with your "network". The problem is not with the network itself, but with the hosts on that network. If all of the hosts on it are Suns, then Sun can't entirely avoid blame.

3.4 is based on 4.3BSD's version of IP. 3.2 is based on 4.2BSD's version of IP. Between 4.2 and 4.3, the broadcast address was changed. (The people who changed the standard should be shot. The amount of damage done to networks and the reputation of IP due to inconsistent broadcast addresses is enormous. By the way, this is not Berkeley's fault. The standard actually changed.) Unfortunately, there are various bugs in 4.2 (and presumably Sun 3.2), such that any disagreement over the broadcast address can cause such a flurry of ICMP unreachables and ARP's that the network becomes unusable.

The solution is going to depend upon the particular set of machines on your network.
You have two choices: find some broadcast address on which everyone can agree, or split the network.

4.3-based systems allow you to set the broadcast address. So do some 4.2-based systems that contain "4.3 enhancements". This includes Ultrix and Pyramid. Unmodified 4.2 systems use net.0 as the broadcast address. E.g. if your network number is 128.6, your broadcast address is 128.6.0.0. The new standard allows either 128.6.255.255 or 255.255.255.255. If you are using subnets, things get more complex. 4.2 didn't support subnets, but if you patched your 4.2 to do so, you will probably have ended up with a broadcast address of net.subnet.0. E.g. for us a typical one would be 128.6.4.0. The new standard, and 4.3, say that the correct broadcast address for a subnetted network is 128.6.4.255.

One approach would be to tell your 4.3-based systems (i.e. your Sun 3.4 systems) to use the old broadcast address. There should be an option to ifconfig to do this. What bothers me is that this option may not take effect during the early stages of booting. However the simplest thing to try would be to change the ifconfig commands, normally present in /etc/rc or /etc/rc.boot to contain the appropriate option. Assuming you don't use subnets, this would be something like

    ifconfig ie0 `/bin/hostname` up -trailers broadcast 128.6.0.0

Everything up to "broadcast" should be whatever your ifconfig command is now. It may be that the option is -broadcast. You should use your own net number in place of 128.6.0.0. You must make this change to /etc/rc.boot for every individual client partition. This means you'll have to bring up the clients one by one single-user or just mount the partitions on the server, using /dev/ndlx (making sure that the clients are not running at the time). You might try this for a few clients to see whether it fixes your problem, before doing it on all of them.
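
To make the two broadcast conventions described above concrete, here is a minimal sketch in Python that computes the old-style (host part all zeros) and new-style (host part all ones) broadcast addresses for a given network. This is only an illustration: the function name broadcast_addresses and the prefix-length notation (128.6.0.0/16 for the unsubnetted class B net, 128.6.4.0/24 for the subnet) are assumptions of the sketch, not anything taken from SunOS or 4.xBSD.

```python
import ipaddress


def broadcast_addresses(network: str) -> dict:
    """Return the broadcast address for `network` under each convention.

    `network` is written with a prefix length (e.g. "128.6.0.0/16"); that
    notation is an assumption of this sketch, used to say where the host
    part of the address begins.
    """
    net = ipaddress.ip_network(network)
    return {
        # Old convention (unmodified 4.2BSD): host part all zeros, "net.0".
        "old": str(net.network_address),
        # New convention (4.3BSD): host part all ones.
        "new": str(net.broadcast_address),
        # The new standard also allows the all-ones address.
        "new_all_ones": "255.255.255.255",
    }


if __name__ == "__main__":
    # The unsubnetted and subnetted examples from the text.
    for example in ("128.6.0.0/16", "128.6.4.0/24"):
        print(example, broadcast_addresses(example))
```

The "old" value is what you would put after the broadcast option on the ifconfig line if you want your 4.3-based systems to agree with unmodified 4.2 hosts.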

In retrospect, Sun would probably have been better off distributing 3.4 with the old broadcast address as a default. Once everyone had upgraded to 3.4, the next release could safely move to the new address, since 3.4 should (if it is properly implemented) accept either. At the very least the setup program should provide this as an option. (Of course I haven't seen 3.4 yet -- maybe it does.)

Other approaches to this problem are to fix all your existing systems to accept the new address (which may be the best solution if you have source to them -- we can give you the changes), or to put a gateway between your 3.4 systems and everything else. If you don't have any other kind of gateway, you could add a second Ethernet board to one of your servers and use it as a gateway.

Finally, if all of your systems are Suns, the simplest thing to do is simply to upgrade them all at once. Bring them all down, and then bring them up one by one on 3.4.