Xref: utzoo comp.protocols.tcp-ip:7202 comp.mail.misc:1913
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!cheops.cis.ohio-state.edu!pritch
From: pritch@cheops.cis.ohio-state.edu (Norm Pritchett)
Newsgroups: comp.protocols.tcp-ip,comp.mail.misc
Subject: Centralized mail systems summary (LONG)
Message-ID: <49488@tut.cis.ohio-state.edu>
Date: 23 May 89 16:59:51 GMT
Sender: news@tut.cis.ohio-state.edu
Followup-To: comp.protocols.tcp-ip
Lines: 512

A few weeks ago I posted a query to find out what people are doing with centralized mail systems. I promised to follow up with a summary of the responses and to let you know what we are going to do at Ohio State. The latter will be in my next posting. Below is a summary of the responses. I mention a contact name for each of the mail systems listed, but that person might not be involved with the project; some are merely users who described the mail system to me.

1) Sun Microsystems. Of the systems described to me, this appears to be the best known -- I received messages from 5 people plus Sun about it. Contact: Bill Melohn.

At Sun we have managed to do this for our roughly 10,000 employees. We currently use a two-tier method of resolving names from a unix user name (like "melohn") to (i.e. "bmelohn") syntax. This alias in turn points to the username@mailboxhost for the user. When conflicts occur in the scheme, we make the alias a pipe to a shell script called ambigmail, which sends mail back to the sender with the various GCOS field entries that match the ambiguous alias. We are in the process of enhancing this scheme by making the second alias from above reference a username@Area, which would allow us to distribute the alias expansion to a series of mailhosts for each area. For mail destined for the Internet, we rewrite outgoing mail headers to be user@Sun.COM; eventually this function will be done on an area basis, with outgoing mail messages that look like user@Area.Sun.COM (as mine does today).
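To make the Sun description a little more concrete, here is a rough Python sketch of an alias table that delivers when an alias is unique and reports the matching full names when it is not (the job the post assigns to the ambigmail script). The records, the host names, and the rule "first initial plus last name" are assumptions inferred from the "bmelohn" example, not Sun's actual implementation.

```python
from collections import defaultdict

# Hypothetical directory records: (unix login, GCOS full name, mailbox host).
PEOPLE = [
    ("melohn", "Bill Melohn", "mailhost1"),
    ("jsmith", "John Smith", "mailhost2"),
    ("jsmith2", "Jane Smith", "mailhost3"),
]


def make_alias(full_name):
    """Assumed alias rule: first initial plus last name, lowercased."""
    parts = full_name.lower().split()
    return parts[0][0] + parts[-1]


def build_alias_table(people):
    """Map each alias to every record that produces it."""
    table = defaultdict(list)
    for login, full_name, host in people:
        table[make_alias(full_name)].append((login, full_name, host))
    return dict(table)


def resolve(alias, table):
    """Return ("deliver", address), ("ambiguous", names) or ("unknown", None)."""
    matches = table.get(alias, [])
    if len(matches) == 1:
        login, _, host = matches[0]
        return ("deliver", f"{login}@{host}")
    if len(matches) > 1:
        # Stand-in for ambigmail: hand the candidate full names back to the sender.
        return ("ambiguous", sorted(name for _, name, _ in matches))
    return ("unknown", None)


TABLE = build_alias_table(PEOPLE)
print(resolve("bmelohn", TABLE))
print(resolve("jsmith", TABLE))
```

In a real sendmail setup the "ambiguous" branch would be an alias that pipes to a script rather than a return value; the sketch only shows the decision.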

2) UC Davis. Contact: jcgargano@ucdavis.edu (Joan Gargano).

I am in charge of our mailname system at U.C. Davis. We have about 20,000 faculty and staff, and 20,000 students. I maintain a database of over 20,000 mailnames (first initial, middle initial, last name) for faculty and staff, which I constructed from the payroll files. We have had a number of collisions, which I have resolved by altering the middle initial of one of the names. Stanford uses a similar system. You may query our database using whois: there is a directory that is accessible via the whois program. We have extended the whois program to search our local database, either through the program itself or through electronic mail.

3) Purdue University. Contact: Dave Stevens (dls@alecto.cc.purdue.edu).

We've just started work on what we call campus-wide electronic mail. We intend to use name server MR records, and Profile for the white pages service. We're thinking about using NeXT workstations in a distributed model.

4) Proteon Inc. Contact: Alan Marshall.

I can recommend CCMail as the product to work with PC networks. It can be connected to SMTP mail via a public domain translator that I have written. The translator is available from monk.protoen.com as arpagw.arc in the /ftp/pub directory. There is another implementation, derived from mine by Mike Morse (mmorse@nsf.com), that handles about 10k users. He would be a good one to talk with about your needs. I have some code from him too, and he has said to make it available. Perhaps there is a more current version that would help.

5) MIT. Contact: Mark Rosentein.

We've been thinking of tackling this problem here at MIT. Our initial planning is as follows:

* The full name of every member of the MIT community will be known to the mail hub.
  Mail sent to someone's full name will result in one of:
    1) The mail is delivered, if the name is unique and the person has a mailbox.
    2) An error response is generated saying "[full name] does not have an electronic mail address, please send mail to MIT Room ..., Cambridge MA 02139".
    3) An error response is generated saying "[full name] is ambiguous, please choose one:" followed by a list of people giving the name, title, address, and a unique email identifier.
    4) An error response saying "addressee unknown".

* Every member of the MIT community will be given a unique identifier for email purposes. For most active email users, this will be their login name. For other people and those with name conflicts, it will be their initials and a number, similar to the NIC's whois database.

This information will be kept up-to-date by Moira, the Athena Service Management System, and regularly updated on the mailhub. Users will be allowed to update some of their own information, and to become unlisted if they want to. Moira currently contains all of the necessary information for the students here at MIT; only the staff and remaining faculty must be added. The primary development effort will be modifications to the mail hub.

6) NCR. Contact: Matt Costello.

Well, I can help out some on this. I designed the system in use throughout NCR and it conforms to your requirements. There are ~1300 people with email addresses in San Diego. I believe Dayton (Galactic Headquarters) has around 6000 email addresses in it.

What I did was to create a database external to the mail system and then have the mail router look up certain addresses in this external database. Any mail addressed to the domain name, or having a period in the username, will be looked up in this database. The database format is a rolodex(tm) format.
My entry is simply

    name   Matthew Costello
    phone  2926
    dept   4796
    email  mattc@ncr-sd

This database format is simple to manipulate and edit using the standard unix tools, but there are also 7 programs (rolo, roloeach, roloedit, roloenter, rolorev, rolorpt & rolosort) that handle it more efficiently. The command rolo(1) is used to look up the local portion of the name using a fuzzy matching technique, so the following will all find my entry and get my mail to me:

    matthew.costello
    matt.costello
    costello            I'm the only Costello in San Diego
    pat.costello
    m.castelli

[ Accepting the last two was a mistake. It would be better to fail and then
[ return the close matches.

Because of the fuzzy matching, the search must be linear through the whole file. To compensate, our mail router is able to cache found addresses in a separate file so they only get looked up once. I would recommend using an "initial substring match", which is amenable to indexing.
--
Matt Costello (CSNET) +1 619 485 2926 uunet!ncrlnk!ncr-sd!mattc

7) Universite Catholique de Louvain. Contact: Alain FONTAINE.

We have established a unified address scheme here. But we did not find any way to allow external correspondents to send mail to an individual when only knowing his name, *and* avoid clashes... This seems theoretically impossible. The sender must *know* and *specify* some more information to guarantee uniqueness. So the addresses used are of the form personal-identifier@unit.ucl.ac.be, where 'unit' is the standardized three or four letter abbreviation of the laboratory or service in which the person can be found. Of course, it is difficult for an external correspondent trying to contact somebody for the first time to guess the 'unit' to be used. On the other hand, clashes are a very low probability event, since units never have more than 50 people.
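A small Python sketch of the personal-identifier@unit.ucl.ac.be idea may help here: uniqueness only has to hold inside one unit, so the same identifier can exist in two units without a clash. The class, the unit names, and the mailboxes are made up for illustration; this is not the Louvain software.

```python
class UnitRegistry:
    """Central routing table keyed first by unit, then by personal identifier."""

    def __init__(self, domain="ucl.ac.be"):
        self.domain = domain
        self.units = {}  # unit name -> {personal identifier: real mailbox}

    def register(self, unit, personal_id, mailbox):
        """Add a person; identifiers must be unique only within their unit."""
        members = self.units.setdefault(unit, {})
        if personal_id in members:
            raise ValueError(f"{personal_id} is already taken in unit {unit}")
        members[personal_id] = mailbox
        return f"{personal_id}@{unit}.{self.domain}"

    def resolve(self, address):
        """Return the real mailbox for a unit-scoped address, or None."""
        local, _, host = address.partition("@")
        suffix = "." + self.domain
        if not host.endswith(suffix):
            return None
        unit = host[: -len(suffix)]
        return self.units.get(unit, {}).get(local)


registry = UnitRegistry()
# Hypothetical units "phys" and "info", each with its own "dupont".
addr_a = registry.register("phys", "dupont", "dupont@host1.example")
addr_b = registry.register("info", "dupont", "jd@host2.example")
print(addr_a, "->", registry.resolve(addr_a))
print(addr_b, "->", registry.resolve(addr_b))
```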

Implementation: the DNS would be a marvelous tool for this, since each unit could have and manage its own name server. Alas (one of my favorite gripes), the arbitrary division of mail addresses into a local and a domain part makes it impossible to use the DNS down to the individual level. So the current situation is that one centralized machine contains a centralized database of mail routing information, and nearly all domain-addressed mail goes physically (uh, should we say that about zeroes-and-ones on wires and disks and ...) through that machine.

8) Carnegie-Mellon University. Contact: Craig Everhart.

Andrew supports 8500 user names reasonably gracefully, though we've given up on making login-names guessable; too many collisions. Instead, we use a White Pages service to map name probes to mailboxes, letting it handle any collisions. My free advice to you would be to forget making names unique; they never will be. Make login-names unique and provide simple ways to map from person-names to login-names (and use them for delivering incoming mail). MCI Mail, with a cast of many hundred thousand, did the same thing; everybody's mailbox is a number.

9) University of Illinois at Urbana. Contact: Paul Pomes.

The Computing Services Office at the University of Illinois at Urbana is in the process of creating a university-wide mailing system. The system consists of three pieces. The largest is the white-pages system created by Steve Dorner of CSO. It's based on the CSnet central name server (qi - Query Interpreter). Each student and staff member is assigned a unique alias. The user is allowed to change the issued alias provided it remains unique. Associated with this alias is the user's preferred email address, office address, home address, phone numbers, etc. Everything that is in the paper phone book is also in the qi database. The user client is a program called ph. It searches on the unique alias and can fuzzy match on names.
Providing ancillary information such as department or curriculum narrows the search.

The second piece is the 5.61+IDA sendmail release. The ida/cf/Sendmail.mc has been very slightly modified to invoke a new mailer, phquery, whenever an address resolves to @uiuc.edu. This is configured with the DOMAINMASTER option.

Phquery is the third piece. It examines its arguments and calls qi to determine the preferred email address for the supplied name. At this point, the name can only be the unique qi alias. This restriction will soon be lifted to allow phquery to resolve full names (e.g., paul-pomes@uiuc.edu -> paul@uxc.cso.uiuc.edu) and amateur radio callsigns (e.g., ka9wgn@uiuc.edu -> phil@vmd.cso.uiuc.edu). In the case of ambiguous matches, phquery will return a list of possibilities that includes department and/or curriculum information, which should allow the sender to make the next attempt successful. Future enhancements include automated printing and campus mailing of messages to those users without email addresses.

Source for the qi (central server) and ph (user client) can be obtained via anon-FTP from uxc.cso.uiuc.edu:/net/{ph,qi}. The phquery code, when ready, will be included in the /mail/sendmail/uiuc directory. Sorry, we cannot email this code as it is much too large. Chocolate chip cookies with a postpaid tape will work wonders though.

10) University of Virginia. Contact: Tom Sigmon.

I am responding to the request in BIG-LAN regarding university-wide electronic mail networks. We here at the University of Virginia have created such an environment that addresses most of the points that were brought up. Our electronic mail environment currently encompasses over 300 machines (not PCs, etc.) having many different mailers running on many different operating systems. I'll try to summarize the basic points here and if there are follow-up questions, I'd be happy to address them.

- we use domain addresses only.
  If a user wants to send mail to someone on a non-domain network, then they must use an appropriate "pseudo-domain" within a domain address. For example, sending mail to someone on Bitnet would require an address of the form "user@host.bitnet". Likewise, sending mail to someone on a UUCP host requires an address of the form "user@host.uucp" (our mailers figure out the best path to the target host).

- third-level domains within the "virginia.edu" domain are named after departments or other University organizations (usually using the standard registrar's designation)

- departments create whatever fourth-level domains or machine names (the usual case) that they desire

- we here in the Academic Computing Center created and maintain a database of every faculty, staff, or student associated with the University. The basic data comes from the registrar's database and the payroll database from our administrative computing center.

- as part of this database, we automatically create unique mail ids for every single person associated with the University. These ids are also (conveniently) used as the login id on most machines. The format of this unique id is as follows: person's initials optionally followed by a 2-character suffix whose first character is a digit and whose second character is alphabetic. For instance, my mail/login id is simply my initials, "tms". All other people who have the same initials as me have a suffix on their mail id, e.g., tms2x, tms4g, etc. Obviously, the choice of format and a priori creation of unique mail/login ids is the most controversial part of our environment. There are advantages and disadvantages of this system which I won't go into unless someone is interested.

- since all of these mail ids are unique, they can be considered to be "aliases" in the "virginia.edu" domain.
  We support the notion of "registration" which creates a mapping between a person's unique mail id (in the virginia.edu domain) and the actual account and domain where that person reads his mail. For example, all mail sent to "tms@virginia.edu" will be delivered to the place where I actually read my mail which is "tms@boole.acc.virginia.edu". Thus, no one needs to know the details of exactly where I read my mail. Every system in our environment allows users to set/change their registration since it is done via a mail message to one of our main mail servers. Most systems wrap a shell script around this registration process so that it is very easy for the user to register or make changes.

- the above "registration" process is very important for mail coming to users from networks that don't support domain names (e.g., Bitnet and UUCPnet) as well as to present one "name" for the University to the outside world. In these cases, if a user is not registered, then our mail servers would not know where to actually deliver the person's mail *especially* since we want to present one "name" to the outside world (i.e., it shouldn't be necessary for anyone to know the full domain name in order to send mail to someone at the University, nor should they need to know the internal network configurations, etc.). For example, the University has one name/address on both Bitnet and UUCPnet. We are "virginia" on both networks, so someone on Bitnet can send mail to me as "tms@virginia" without regard to the actual machine I use to read my mail, and likewise, someone on UUCPnet can send mail to me as "...!virginia!tms" without regard to the actual machine I use to read my mail.

- we also support user-created aliases at the virginia.edu level. If a user does not like their automatically created unique mail id or would simply prefer to have other aliases, then they can request the creation of such aliases.
  For example, in addition to sending mail to me as "tms@virginia.edu", people can also send mail to me as "sigmon@virginia.edu" or "9240615@virginia.edu" (which is my phone number). The only restrictions that we place on these user-created aliases are that they (obviously) must be unique, must not conflict with the regular expressions that describe our automatically-generated ids (so that they don't preempt future automatically-generated ids), and must be "reasonable" (e.g., we don't allow people to be MickeyMouse, nor GeorgeBush, etc.).

- of course, none of the above prohibits users from having aliases in other domains. The Academic Computing Center administers ids and aliases at the virginia.edu level for the entire University. Departments are free to have their own aliases in their own domains (except that mail coming from non-domain-based networks can't access them for obvious reasons).

- the University telephone book has a section that lists the electronic mail ids and aliases for all registered mail users at the University. In addition, we support a "whois" capability on many of our machines that allows users to interactively query our database to determine mail ids, phone numbers, department affiliations, etc.

Hope this helps others establish university-wide mail networks. I'm happy to provide more detail or answer questions, just send me mail!

11) Stanford University. Contact: Bob Morgan.

Yes, assigning what we have come to call a "unique-id" to a campus-full of people is a tricky issue. We have made a few abortive attempts at campus-wide mail delivery (unique-id@stanford.edu), but have run into the twin problems of a) choosing the unique-id, and b) letting people update their unique-id/actual-mailbox mapping without involving great piles of paper/bureaucracy/our time.

Right now we generate unique-ids for use with a phone-book-type service (based on Whois, RFC 812), using the following algorithm, moving down the list in case of name clash:

  1) first-initial/last-name (rmorgan), or
  2) first-initial/middle-initial/last-name (rlmorgan), or
  3) as-many-initials-as-necessary/last-name (rlfmorgan), or
  4) as-many-letters-of-first-name-as-necessary/last-name (robmorgan), or
  5) first-name/middle-initial/last-name (robertlmorgan), or
  6) rule 5 with digits appended as necessary (robertlmorgan3).

(If you have a Unix "whois" client, you can bang our server with:
    whois -h argus.stanford.edu some-string)

Looking through our 153 Smiths, I see no uses of rule #6, about 5 cases where #5 was used, and several instances of repeated application of #4 (dasmith, davsmith, davismith, davidsmith).

I suspect that if we started using this for actual mail delivery (or, even more so, for Kerberos-style principal ids), some people would complain and insist on choosing their own. The question then becomes, how do you decide what's reasonable? If Joe Student wants to be known as "donaldkennedy" (SU's president), is that OK? Part of the problem is that if these things are assigned immediately when people arrive (as they must be), then people will be stuck with something before they know what it's about (as with real names, I suppose).

No solutions, just more questions,

12) Dartmouth College. Contact: Steve Campbell.

Dartmouth has a scheme much like the one you describe, called the Dartmouth Name Directory. The DND is a database of about 13,000 names with corresponding nicknames, password, paper-mail address, phone number, department (or undergraduate class), and e-mail address. Mail addressed to Joe.Blow@dartmouth.edu goes to Joe Blow's preferred e-mail address, as it is recorded in the DND.

People are uniquely identified by the tokens in their name -- the name space is small enough that first name + middle initial + last name is unique in all but a very few cases. Those people have their middle names entered also. The names, nicknames, and departments are all lookup keys, and partial matches are supported. So you can mail to me with "James.W.Matthews@dartmouth" (my full name), "James W M" (abbreviating the last name) or "Jim Matthews" (matching my last name and a nickname). Only one token match must be exact. If there are multiple matches, a bounce message is generated, listing the matches (as long as there are fewer than fifteen or so). So it is fairly easy to refine an address to the required precision.

The DND is seeded by the personnel and registration systems, and several fields (paper mail address, e-mail address, phone number, and nicknames) are user-maintained. The default e-mail address is our paper mail system -- messages are printed out and hand delivered.

13) "Track".

I suggest you consider a software distribution to control the mail software. See _1989 USENIX Software Management Workshop Proceedings_ for a good discussion or two or three on some software called Track.

14) UCSD. Contact: Brian Kantor.

UCSD has such a mail system. You may query it over the net to see what it looks like with the 'whois' command; try

    whois -h ucsd.edu smith
    whois -h ucsd.edu jsmith

and variations along that line. The software to implement this is available in our anonymous FTP directory; take file pub/mailreg.tar.Z. Caveat Emptor: the software is continuously being refined and is not documented.

15) University of Kent at Canterbury. Contact .

We operate a unified mail system for some 4000 staff and students. We use a centralised admin server which allocates a unique userid for each user. In addition it will allocate a login (also unique) for the machines they require.
On top of the admin server we run a mail database system (originally designed at Edinburgh University). The user interface to this is the mailhost program. The user may nominate any machine he can log into as his mail machine. He does this by typing "mailhost -c". At night all the machines swap lists of users. Each entry in the list has a date stamp; if this stamp is later than the local machine's recorded date stamp for the user, the entry is updated.

An extension to this is being worked on at the moment which allows pseudo domains. A group of workstations (typically) have a pseudo domain centered on their fileserver. This is still experimental. The system has been in use for about three years now with no problems.

16) UMCP. Contact: Mark Feldman.

I know of a few places that do this sort of thing. First, there's the UMAIL project here at UMCP. One can send mail to various user names at host umail.umd.edu, and the mail gets delivered, even if the user doesn't have a computer account anywhere. (In that case, mail gets printed out and sent via campus mail.) I am by no means fully up on the details of this system; you might talk to Mark Feldman (feldman@umd5.umd.edu) to get more information. He may not be the right contact, but he can probably name the right people if asked.

...

Contact Steve

Third, I'm implementing something that, while it doesn't do everything, does most of what you want (and which can be extended to do more). Basically, it's an automatic method of generating a 'global' mail address database, made from the union of the password and alias files for a particular department. Getting all such files for a whole campus would be hard, but there's also a facility for getting just another department's global database and merging it in with others. There's a concept of locality-of-reference, too; if I send mail to 'root', I get the UMIACS root, even though there's a 'root' in the CS Department's imported information.
This locality can be guaranteed either manually or automatically.

If extended with some software to generate full-name addresses (First.MI.Last), and some software to handle duplicates differently, my code would probably do everything you need. The only hitch is that the stuff I've written is not yet in extensive use (so I'd bet it has, er, misfeatures), and it's essentially undocumented. If you want a copy of the code as it stands, I could provide one. It's solid (it's even been Saberized), but it's definitely in need of some major cleaning up...

17) ATT Private Mail Exchange. Contact: Rod Hart.

Look into the AT&T Private Mail Exchange System (PMX). My organization is in the process of installing one right now to solve a similar problem. We need X.400 in order to tie the various user groups (i.e., DG, PROFS, Wang, PC, and of course Unix) together, as well as document conversion.

-=-
Norm Pritchett, The Ohio State University College of Engineering Network
Internet: pritchett@eng.ohio-state.edu    BITNET: TS1703 at OHSTVMA
UUCP: pritch@sydney.columbus.oh.us    CCNET: ENG::PRITCHETT (6172::PRITCHETT)