Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!mips!decwrl!ucbvax!THUMPER.BELLCORE.COM!nsb
From: nsb@THUMPER.BELLCORE.COM (Nathaniel Borenstein)
Newsgroups: comp.soft-sys.andrew
Subject: Some REAL confessions about AMS speed
Message-ID:
Date: 11 Jul 90 12:01:49 GMT
References: <8aacULUB0KGW44iMMw@holmes.parc.xerox.com>
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 132

Excerpts from internet.info-andrew: 10-Jul-90 Re: messages - a confession Bill Janssen@parc.xerox. (882)

> My pet peeve is message drop-off time between `sendmessage' and
> /usr/lib/sendmail. It hangs for dozens of seconds -- unlike /bin/Mail,
> or GNU Emacs's version of sendmail. Why is that? And why should the
> mail-reading tool be locked up during that time? Why isn't that forked
> off in the background? Are there some sendmail switches set wrong?

Yes -- there are sendmail switches set wrong for all other mailers in
the world, and only AMS sets them right. When AMS tells you "your
message has been sent" it means it. Other mailers quickly come back and
pretend everything is OK when it isn't really.

If you're willing to live with the risks that other mailers give you --
notably that sendmail will die right away for lack of memory and your
mail will never get sent and you'll never hear about it -- it is very
easy to make things work faster. Just use the "OldSendmailProgram"
AndrewSetup variable to point to something other than /usr/lib/sendmail
for your "sendmail" program. The thing it points to can be a shell
script that calls sendmail with whatever options you like; just don't
complain to me when you find that your mail occasionally disappears
into a black hole.

Actually, while I'm confessing things, I think I'd like to go on record
as believing that many of Messages' performance problems can be traced
to the fact that we tried to make it extremely reliable, more so than
other mailers.
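
For anyone who wants to try the wrapper idea above, here is a minimal sketch of what "OldSendmailProgram" could point at, written in Python rather than as a shell script. It is an illustration under assumptions, not anything shipped with AMS: it assumes AMS invokes the program with the same arguments and standard input it would give the real sendmail, that no conflicting delivery-mode option is already among those arguments, and that `-odb` (sendmail's background delivery mode) is the trade-off you want. The names `build_argv`, `FAST_FLAGS` and `main` are made up for this example.

```python
import os
import sys

# Path of the real sendmail, as named in the post.
REAL_SENDMAIL = "/usr/lib/sendmail"

# Assumption: "-odb" asks sendmail to deliver in the background, so the
# caller gets control back before delivery has actually been attempted.
FAST_FLAGS = ["-odb"]


def build_argv(args):
    """Return the argument vector for the real sendmail.

    The caller's arguments are passed through unchanged after the
    extra delivery-mode flag.
    """
    return [REAL_SENDMAIL] + FAST_FLAGS + list(args)


def main():
    # Replace this process with sendmail. Standard input (the message
    # text) is inherited, so nothing has to be copied by the wrapper.
    argv = build_argv(sys.argv[1:])
    os.execv(argv[0], argv)

# When installed as the wrapper script, finish the file with a call to
# main(); it is left out here so the sketch can be imported safely.
```

With this in place, a failure inside sendmail after it has gone to the background is no longer reported to the sender, which is exactly the risk described above.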
I just had an exchange of personal mail with Bill Cattey on the topic,
and I'll quote from what I told him on the subject of why it sometimes
takes a long time to incorporate new mail from your mailbox into the AMS
database:

---------------- Begin long quotation from mail to Bill Cattey ----------------

I hate to say it, but I think the real problem is that the code is too
careful, and thus tends to amplify any file system performance problems.
Consider the basic "inc" scenario where you take a single piece of mail
out of /usr/spool/mail/xxx and put it into .MESSAGES/yyy/+zzz (this
comes from memory so it may miss a step or two):

1.  Open /usr/spool/mail/xxx.
2.  Lock /usr/spool/mail/xxx.
3.  Open .MESSAGES/yyy/.MS_MsgDir
4.  Lock .MESSAGES/yyy/.MS_MsgDir
5.  Open .MESSAGES/yyy/.AMS_DIRMOD (This will be our trace if things
    get aborted mid-operation)
6.  Open .MESSAGES/yyy/+zzz (the body file)
7.  Write .MESSAGES/yyy/+zzz (the body file)
8.  Close .MESSAGES/yyy/+zzz (the body file)
9.  Write .MESSAGES/yyy/.MS_MsgDir
10. Fsync .MESSAGES/yyy/.MS_MsgDir (We don't close it now because we
    may be processing multiple things, but we need to fsync it to make
    the data safe.)
11. Unlink .MESSAGES/yyy/.AMS_DIRMOD
12. Truncate /usr/spool/mail/xxx to zero length.
13. Close .MESSAGES/yyy/.MS_MsgDir

Now, all of these steps can be justified as necessary in terms of
reliability. But consider what a mail interface that was willing to
trade off a little reliability for some performance could do instead
(and I believe this is what many mailers actually do):

1.  Open /usr/spool/mail/xxx.
2.  Open .MESSAGES/yyy/.MS_MsgDir
3.  Open .MESSAGES/yyy/+zzz (the body file)
4.  Write .MESSAGES/yyy/+zzz (the body file)
5.  Close .MESSAGES/yyy/+zzz (the body file)
6.  Write .MESSAGES/yyy/.MS_MsgDir
7.  Truncate /usr/spool/mail/xxx to zero length.
8.  Close .MESSAGES/yyy/.MS_MsgDir

So by eliminating all the locking and synchronization, we've also
eliminated over a third of the network file system calls (5 of the 13).
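
To make the two sequences above concrete, here is a rough Python sketch that performs either the 13-step careful version or the 8-step fast version and returns the list of numbered steps it carried out. It is not AMS code. The function name `incorporate`, the helper `_write_all`, the use of `flock` for locking, and the one-line text record appended to `.MS_MsgDir` are all assumptions made for illustration; the real index format and locking calls are not described in the post. The sketch also treats the whole spool file as a single message.

```python
import fcntl
import os


def _write_all(fd, data):
    """Write every byte of data to fd, retrying after short writes."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def incorporate(spool_path, folder, msg_name, careful=True):
    """Move the spool contents into folder/msg_name.

    Returns the labels of the numbered steps from the post that were
    performed, in order. Reading the spool and closing it are ordinary
    housekeeping that the post's lists do not count, so they are not
    recorded.
    """
    steps = []
    index_path = os.path.join(folder, ".MS_MsgDir")
    trace_path = os.path.join(folder, ".AMS_DIRMOD")
    body_path = os.path.join(folder, msg_name)

    spool_fd = os.open(spool_path, os.O_RDWR)
    steps.append("open spool")
    index_fd = None
    try:
        if careful:
            fcntl.flock(spool_fd, fcntl.LOCK_EX)
            steps.append("lock spool")
        index_fd = os.open(index_path,
                           os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        steps.append("open index")
        if careful:
            fcntl.flock(index_fd, fcntl.LOCK_EX)
            steps.append("lock index")
            # The trace file's presence after a crash marks an update
            # that was interrupted part-way through.
            os.close(os.open(trace_path, os.O_WRONLY | os.O_CREAT, 0o600))
            steps.append("open trace")

        # Read the whole spool file (treated here as one message).
        chunks = []
        while True:
            chunk = os.read(spool_fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        message = b"".join(chunks)

        body_fd = os.open(body_path,
                          os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        steps.append("open body")
        try:
            _write_all(body_fd, message)
            steps.append("write body")
        finally:
            os.close(body_fd)
        steps.append("close body")

        # Hypothetical index record: "<name> <size>\n".
        record = "{} {}\n".format(msg_name, len(message)).encode()
        _write_all(index_fd, record)
        steps.append("write index")
        if careful:
            os.fsync(index_fd)       # force the index update to disk
            steps.append("fsync index")
            os.unlink(trace_path)    # update completed, drop the trace
            steps.append("unlink trace")

        os.ftruncate(spool_fd, 0)
        steps.append("truncate spool")
    finally:
        if index_fd is not None:
            os.close(index_fd)       # closing also releases the lock
        os.close(spool_fd)
    steps.append("close index")
    return steps
```

Calling `incorporate(spool, folder, "+zzz")` records 13 steps and `incorporate(spool, folder, "+zzz", careful=False)` records 8. On a local disk the difference is hard to notice; the point of the comparison above is that on a network file system each of those steps can be a round trip to the server.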
Now consider an interface like MH, which (if I recall correctly)
doesn't have any index files. Its file system calls could be reduced to
the following:

1.  Open /usr/spool/mail/xxx.
2.  Open .MESSAGES/yyy/+zzz (the body file)
3.  Write .MESSAGES/yyy/+zzz (the body file)
4.  Close .MESSAGES/yyy/+zzz (the body file)
5.  Truncate /usr/spool/mail/xxx to zero length.

Now, actually I think that MH (like nearly all mailers) does lock the
/usr/spool/mail file, but doesn't have any index files and doesn't lock
anything else. This means that it does 6 file system operations (the 5
above plus the lock) where AMS does 13. I seriously doubt that you need
to look any further to find the performance differences on the "inc"
operation.

---------------- End quotation ----------------

Now, for a mail system explicitly designed for large-scale bboard
support, a lack of index files would be crazy. And we were just dead
serious about making the system really reliable -- possibly too serious,
but I still don't believe that. The bottom line is that if you try to
build a reliable database on top of a distributed UNIX file system, it's
going to be VERY slow. Most mailers give up on reliability; we gave up
some speed. You pays your money & you takes your choice...

Now, one thing that wouldn't be too hard to add to AMS would be a
"LiveDangerously" preference. It could avoid a lot of file locking and
fsync'ing and could give sendmail the "fork and be happy" option. There
would really be a minimal amount of coding necessary to provide such an
option, and AMS would then be as unreliable as any other mailer, but I'd
really hate to do it -- people would lose mail and then complain about
AMS losing their mail!

Well, the above diatribe may not make AMS run any faster for you, but
perhaps it will give you some more insight into what it's doing when
it's too slow...

-- Nathaniel