Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac,att!ucbvax!CAEN.ENGIN.UMICH.EDU!paul
From: paul@CAEN.ENGIN.UMICH.EDU (Paul Killey)
Newsgroups: comp.sys.apollo
Subject: mail on apollos ...
Message-ID: <50724260e.000b141@caen.engin.umich.edu>
Date: 18 Mar 91 23:43:44 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 61

Recently people have asked this question:

    "does this or that version of sendmail work on apollos?"

The question can be rephrased:

    "does this version of apollo work under sendmail?"

Are there sites out there that deliver or queue 8,000 - 10,000
messages a day on/through their ring?  We do.

Three mods to mail are required to handle this volume of mail.

1.  Use dbm files for /etc/passwd info, and do not query the rgy
    all the time.

2.  Use nR_xor_1W concurrency control and not cowriters, so that
    different nodes can process files without regard for which
    node has the disk attached.

3.  Have sendmail "tempfail" errors like ios_$concurrency_violation,
    and get clues from the difference between ios_$name_not_found
    and ios_$object_not_found.  Along with #2, this makes alias
    files much easier to deal with.  It also makes it harder to
    miss someone's forward file.

3a. Display the apollo error text, and not just the perror() text.

If you see things like sfcb allocation failures, or can't lock pipe
errors, just go ahead and reboot.

If you use the rgy, and do any volume, you need to make sure that
any 0 returned by getpw routines is not accompanied by an errno
(a 0 with errno set is a failed lookup, not a missing user).  If you
use the /etc/passwd approach, you will get your errors at open time
(you hope), but we have seen hours and hours go by with a machine
still failing to get the rgy data cached into `node_data/systmp.
When we quit using the rgy, we discovered that we ran into other
problems, like not getting sfcbs, having mutex locks never released
by processes trying to get an IP route, and other fatalities.

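The getpw check mentioned above can be sketched roughly as follows.
This is a minimal illustration in C, not the actual sendmail change
described in this post: classify_user(), the lookup_result codes and
the three stub_ lookups are made-up names, and the lookup is passed
in as a function pointer only so the logic can be exercised without
a registry.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stddef.h>

/* Hypothetical result codes.  A real mailer would map these onto its
 * own statuses (for example EX_NOUSER and EX_TEMPFAIL in <sysexits.h>). */
enum lookup_result { USER_FOUND, USER_UNKNOWN, USER_TEMPFAIL };

/* Any function with the same shape as getpwnam(3). */
typedef struct passwd *(*pw_lookup_fn)(const char *name);

/* Decide whether a failed lookup means "no such user" or "try later". */
enum lookup_result classify_user(const char *name, pw_lookup_fn lookup)
{
    struct passwd *pw;

    assert(name != NULL && lookup != NULL);

    errno = 0;               /* clear any stale error before the call */
    pw = lookup(name);
    if (pw != NULL)
        return USER_FOUND;
    if (errno != 0)          /* 0 returned *and* errno set: lookup broke */
        return USER_TEMPFAIL;
    return USER_UNKNOWN;     /* 0 returned, errno untouched: no entry */
}

/* Stand-in lookups (assumptions, for demonstration only). */
static struct passwd fake_entry;

struct passwd *stub_found(const char *name)
{
    (void)name;
    return &fake_entry;      /* pretend the user exists */
}

struct passwd *stub_missing(const char *name)
{
    (void)name;
    return NULL;             /* clean miss, errno left alone */
}

struct passwd *stub_broken(const char *name)
{
    (void)name;
    errno = EIO;             /* pretend the password source failed */
    return NULL;
}
```

Calling classify_user("paul", getpwnam) would use the system routine
instead of a stub.  Some C libraries set errno even on a plain miss,
so a production version would probably need to decide which errno
values really count as temporary.
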
So, I call proc2_$list() and see how many procs are running.  The
load average is not sufficient because the # of procs can get quite
large without appreciably bumping the load ave.

Anyway, I don't know offhand what arrangements the "king james" or
IDA releases make for apollos, but in my experience the
aforementioned ones have been crucial here.  We also spike a dec3100
at load aves. of anywhere from 40-60 doing news and mail, so after a
while, you just get used to "big mail."

Having said all this, the apollo/domain file system architecture is
really very good for doing the definitive distributed mail
environment.  We do run into loading problems and plain old bugs,
but we could not provide the scale of service we do now on anything
but apollos, quite honestly.  One hopes that other distributed or
networked file systems will get better and have features that
support the same sort of functionality I've gotten used to on
apollos.

--paul