Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac,att!ucbvax!CAEN.ENGIN.UMICH.EDU!paul
From: paul@CAEN.ENGIN.UMICH.EDU (Paul Killey)
Newsgroups: comp.sys.apollo
Subject: mail on apollos ...
Message-ID: <50724260e.000b141@caen.engin.umich.edu>
Date: 18 Mar 91 23:43:44 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 61


Recently people have asked this question:

"does this or that version of sendmail work on apollos?"  the question
can be rephrased: "does this version of apollo work under sendmail?"

are there sites out there that deliver or queue 8,000 - 10,000 messages
a day on/through their ring?

we do.

three mods to mail are required to handle this volume of mail.

1.  use dbm files for /etc/passwd info, and do not query the rgy all
the time.

2.  use nR_xor_1W concurrency control and not cowriters, so you are
able to have different nodes process files without regard for which
node has the disk attached.

3.  have sendmail "tempfail" errors like ios_$concurrency_violation,
and get clues from the difference between ios_$name_not_found and
ios_$object_not_found.  Along with #2, this makes alias files much
easier to deal with.  Also makes it harder to miss someone's forward
file.

3a.  display the apollo error text, and not just the perror() text.  if
you see things like sfcb allocation failures, or can't lock pipe
errors, just go ahead and reboot.
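
To make item 3 concrete, here is a rough sketch in Python (for brevity; it is not sendmail's C and not our actual patch). The status names are the ones mentioned above, but representing them as strings, the function name, and the mapping itself are assumptions for illustration. In particular, it assumes name_not_found means the forward file really is absent, while object_not_found means the object cannot be reached right now.

```python
# Exit code from BSD <sysexits.h>; sendmail requeues mail on this value.
EX_TEMPFAIL = 75

# Hypothetical string labels standing in for Domain/OS status codes.
# Assumption: these two indicate a condition that may clear up later.
TRANSIENT = {"ios_$concurrency_violation", "ios_$object_not_found"}
# Assumption: this one means the name truly does not exist.
ABSENT = {"ios_$name_not_found"}


def forward_file_disposition(status):
    """Decide what to do after trying to open a user's forward file.

    status is None on success, otherwise one of the labels above.
    Returns "use", "absent", or "tempfail".
    """
    if status is None:
        return "use"        # file opened fine, honor it
    if status in ABSENT:
        return "absent"     # no forward file, deliver locally
    # Transient errors, and anything unrecognized, are queued for retry
    # so that a forward file is never silently skipped.
    return "tempfail"
```

Treating unknown errors as "tempfail" is a conservative choice made for this sketch: the message waits in the queue instead of being delivered to the wrong place.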

if you use the rgy, and do any volume, you need to check that any 0
returned by the getpw routines is not accompanied by an errno;
otherwise a failed lookup looks just like "no such user."  if you use
the /etc/passwd approach, you will get your errors at open time (you
hope), but we have seen hours and hours go by with a machine still
unable to get the rgy data cached into `node_data/systmp.
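
The idea behind item 1 and the errno check can be sketched as follows, again in Python and not in sendmail's C. It uses Python's portable dbm.dumb module as a stand-in for the dbm files; the function names, the LookupTempFail exception, and the record layout are hypothetical. The point is only that "user not in the database" and "could not read the database" come back as different results.

```python
import dbm.dumb


class LookupTempFail(Exception):
    """The database could not be read; the caller should retry later."""


def build_passwd_db(passwd_lines, db_path):
    """Index passwd-format lines by login name in a dbm file."""
    with dbm.dumb.open(db_path, "n") as db:
        for line in passwd_lines:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            name = line.split(":", 1)[0]
            db[name.encode()] = line.encode()


def lookup_user(db_path, name):
    """Return the passwd fields for name, or None if there is no such user.

    Raises LookupTempFail if the database itself cannot be opened, so a
    broken lookup is never mistaken for a missing user.
    """
    try:
        db = dbm.dumb.open(db_path, "r")
    except OSError as exc:
        raise LookupTempFail(str(exc)) from exc
    with db:
        raw = db.get(name.encode())
    return None if raw is None else raw.decode().split(":")
```

A caller would bounce the message only on a None result and tempfail on LookupTempFail.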

when we quit using the rgy, we discovered that we ran into other
problems, like not getting sfcbs, having mutex locks never released by
processes trying to get an IP route, and other fatalities.  So, I call
proc2_$list() and see how many procs are running.  The load average is
not sufficient because the # of procs can get quite large without
appreciably bumping the load ave.
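
A minimal sketch of that check, in Python for illustration only. The process count would come from proc2_$list() on an apollo; here it is just passed in as a number, and the function name and both thresholds are made-up values, not the ones we use.

```python
def should_accept_work(proc_count, load_average, max_procs=200, max_load=12.0):
    """Return True if the node looks healthy enough to take on more mail.

    Both limits are hypothetical. The process count is checked as well as
    the load average because a node can hold a large number of processes
    while its load average still looks low.
    """
    return proc_count < max_procs and load_average < max_load
```

With these example limits, should_accept_work(500, 1.0) is False even though the load average is low, which is the case the load average alone would miss.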

Anyway, I don't know offhand what arrangements the "king james" or IDA
releases make for apollos, but in my experience the aforementioned ones
have been crucial here.

We also spike a dec3100 at load aves. of anywhere from 40-60 doing news
and mail, so after a while, you just get used to "big mail."

Having said all this, the apollo/domain file system architecture is
really very good for doing the definitive distributed mail
environment.  We do run into loading problems and plain old bugs, but
we could not provide the scale of service we do now on anything but
apollos, quite honestly.

One hopes that other distributed or networked file systems will get
better and have features that support the same sort of functionality
I've gotten used to on apollos.

--paul