Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 alpha 4/3/85; site cbosgd.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!panda!talcott!harvard!seismo!cbosgd!mark
From: mark@cbosgd.UUCP (Mark Horton)
Newsgroups: net.lan,net.periphs
Subject: Interlan boards and evil waves
Message-ID: <1044@cbosgd.UUCP>
Date: Sat, 13-Apr-85 13:24:00 EST
Article-I.D.: cbosgd.1044
Posted: Sat Apr 13 13:24:00 1985
Date-Received: Wed, 17-Apr-85 01:36:33 EST
Organization: AT&T Bell Laboratories, Columbus
Lines: 91
Keywords: ethernet, interlan, sun
Xref: watmath net.lan:756 net.periphs:733

We've run up against an interesting property of our Ethernet, and
would like to report it and see if anybody else has seen the same
behavior.

Our Ethernet looks like this (roughly):

	terminator
	   (50 feet)
	cborion (diskless Sun 120 with 3Com board and 3Com xcvr)
	   (150 feet)
	cbcephus (VAX 785 with Interlan board, no software installed yet)
	cbtac (Bridge CS/100 with modified Interlan xcvr)
	   (150 feet)
	cbosgd (VAX 750 with Interlan board, Interlan xcvr)
	   (5 feet)
	cbhydra (Sun 170 with 3Com board, Interlan xcvr)
	   (50 feet)
	cbpavo (Sun 120 with local disk, 3Com board, 3Com xcvr)
	   (200 feet)
	terminator

We've been having a severe noise problem between hydra and orion: the
Sun net disk protocol will report that the net disk isn't responding,
wait several seconds, report that it's OK again, then immediately
report that it's not responding again. This varies wildly - sometimes
it's so severe you can't get any work done, other times it's fine.

We've run a TDR (it shows the cable is fine) and replaced all the taps,
xcvr cables, and so forth. We even tried plugging pavo into orion's
slot, and pavo has the same problem. Orion works fine in pavo's slot.
Pavo does have problems with the net disk going away, but they are
brief and rare (3 error messages per day worst case) and don't affect
use of the machine.
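
For reference, the cable distances implied by the layout above can be tallied with a short sketch. The station names and footages are taken from the layout; the terminator labels are made up for illustration, and the zero spacing between cbcephus and cbtac is an assumption, since the layout gives no figure for it.

```python
# Rough tally of cable distances from the layout above.
# Each entry is (station, feet of cable to the next station).
SEGMENTS = [
    ("terminator-A", 50),   # label is hypothetical; the layout just says "terminator"
    ("cborion", 150),
    ("cbcephus", 0),        # ASSUMPTION: no distance is given between cbcephus and cbtac
    ("cbtac", 150),
    ("cbosgd", 5),
    ("cbhydra", 50),
    ("cbpavo", 200),
    ("terminator-B", 0),    # end of cable
]


def positions(segments):
    """Return {station: feet from the first terminator}."""
    pos = {}
    offset = 0
    for name, gap in segments:
        pos[name] = offset
        offset += gap
    return pos


def distance(a, b, segments=SEGMENTS):
    """Cable distance in feet between two stations."""
    pos = positions(segments)
    return abs(pos[a] - pos[b])


if __name__ == "__main__":
    pairs = [
        ("cborion", "cbhydra"),
        ("cborion", "cbosgd"),
        ("cbosgd", "cbhydra"),
        ("cbosgd", "cbpavo"),
    ]
    for a, b in pairs:
        print(f"{a} <-> {b}: {distance(a, b)} feet")
    print("total cable:", positions(SEGMENTS)["terminator-B"], "feet")
```

Under that assumption, orion sits about 300 feet from osgd, against 5 feet for hydra and 55 feet for pavo, and the whole run comes to roughly 605 feet.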

The one thing we had not replaced was the Ethernet cable itself, so we
suspected it. But you know what a pain it is to run 300 feet or more of
cable, and we didn't have the foresight to run a second cable, so we
were reluctant to substitute a cable.

Well, yesterday we got fed up and dragged out the spool of cable.
After confirming that the xcvr, xcvr cable, and Sun 120 could all be
swapped and that the problem just occurred at that location on the
Ethernet, we ran a second cable from cborion to about 50 feet short of
cbosgd (bypassing cephus and the tac.) We were amazed when it showed
exactly the same problem, especially since the cable just ran down the
hall, not up in the ceiling.

The problem was pretty severe yesterday - orion was in a catatonic
state; we couldn't even get a response from the shell. When booting it,
the problem reproduced very consistently - we would get 2 to 5 ?'s
during the boot sequence (on top of the -'s and ='s while the netdisk
booted.)

So we started to simplify things, and first we unplugged the xcvr cable
from cbosgd. The problem magically went away. Putting back the real
cable and plugging/unplugging osgd's xcvr cable repeatedly confirmed
that this was 100% correlated with the problem. OSGD was somehow
putting evil waves out onto the net that kept Orion (which is
physically the furthest away) from reliably talking to Hydra. (Our
cable person insists that he's had OSGD unplugged before and the
problem remained; I can only report what we observed yesterday, and
speculate that there may be other factors here that don't meet the
eye.)

We swapped the transceivers between Hydra and OSGD - same result. We
changed the xcvr cable, the little 10 foot board-to-xcvr cable, and
swapped between 3 different Interlan boards. Same result - if anything
it got worse with the other boards.

I would like to know if anyone else out there has seen this or a
similar problem. If you recognize it and have a nice solution, I'd be
interested.
If you know something about the Interlan board or the 750 that might
explain this, I'd sure love to hear about it.

Note that there is also an identical Interlan board in Cephus, but this
doesn't seem to matter; it's not causing any problems. (Cephus runs
System V Release 2, and TWG still hasn't installed the TCP/IP we
ordered, so it's just sitting there not doing anything.) Note also that
what mattered was whether the cable was plugged into OSGD, and that
OSGD is NOT putting out any traffic that would swamp the network (that
we know of) - it even happens when OSGD has just been booted single
user. Finally, note that I understand that the Interlan board has a
known problem that it LISTENS to garbage on the net so it can't receive
reliably from certain Ethernet boards that use the Seeq chip, but this
appears to be a problem with noise being TRANSMITTED by the board.
(We're blaming the board because I don't see how the 750 could be
responsible itself.)

Our next things to try include using a 3Com transceiver on the Interlan
board, and getting our hands on a DEUNA or a 3Com board. (We have an
Excelan board we could use if there were any software that we could
install without doing a major porting job. I understand BRL has done
the port but I don't know enough to know how to get or install it in an
existing system.)

We would also like to plug in some kind of Ethernet monitor so we could
look at what's on the cable; rumor had it that the MIT IBM PC code had
such an animal, but the copy we just got from Sparticus doesn't have it
(just a ping program.) Any pointers to such a tool (preferably for a PC
or a Sun 120) would also be appreciated.

	Mark