Path: utzoo!utgpu!watserv1!watmath!att!bu.edu!xylogics!samsung!munnari.oz.au!mundamutti.cs.mu.OZ.AU!kre
From: kre@cs.mu.OZ.AU (Robert Elz)
Newsgroups: comp.protocols.tcp-ip.domains
Subject: Re: problems with nsfnet-relay.ac.uk
Message-ID: <kre.668023269@mundamutti.cs.mu.OZ.AU>
Date: 3 Mar 91 18:01:09 GMT
References: <1666.667855170@xtel.co.uk> <9103021528.AA09106@rimfaxe.diku.dk>
Sender: news@cs.mu.oz.au
Distribution: inet
Lines: 33

thorinn@RIMFAXE.DIKU.DK (Lars Henrik Mathiesen) writes:

>That is *wrong*, it should accept the connection, send a 421
>reply code (service unavailable), and close the connection.

This sounds like a good thing to do, but in practice it can
be a disaster.  It takes some CPU time to do this, which isn't
something you have available in these circumstances (and remember,
it's not just one of these connections, it's possibly dozens, all at once).

But worse - the characteristics are wrong.  On receiving a 421,
sending mailers typically queue the message and attempt to send it
again later.  If lots of 421s are sent at the same time, then it's
likely that you will get lots of mail being retransmitted an hour later,
overloading the recipient system yet again, and so on.
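
To put rough numbers on that, here is a small Python sketch.  It is
only an illustration under assumptions of mine: every sender retries
after the same fixed interval (60 minutes here), and every attempt
is again answered with a 421.  The function name and its parameters
are made up for the example.

```python
from collections import Counter

def connections_per_minute(arrival_minutes, retry_after=60, rounds=2):
    """Count connection attempts per minute.

    Assumption (for illustration only): every attempt gets a 421 and
    every sender retries after the same fixed interval, `rounds` times.
    """
    load = Counter()
    for t in arrival_minutes:
        # The original attempt plus each synchronized retry.
        for k in range(rounds + 1):
            load[t + k * retry_after] += 1
    return load

# Six messages arriving within three minutes of each other.
load = connections_per_minute([0, 0, 1, 1, 1, 2])
for minute in sorted(load):
    print(minute, load[minute])
```

The shape of the original burst simply reappears at minutes 60-62
and 120-122; nothing in the 421 exchange spreads it out.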

But if the SYN is simply not answered or, probably better, a RST
is sent, then the sending mailer will typically try a secondary
MX, and leave the mail there - that mailer can then forward the
mail to the primary MX (or handle it otherwise) - which it will
generally do by serialising the messages and sending one at a time,
vastly decreasing the instantaneous load on the receiving server.
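
The difference in sender behaviour can be sketched like this, again
in Python.  This models only the typical behaviour described above,
not any particular mailer; `Outcome`, `deliver`, the `attempt`
callback and the host names are all hypothetical.

```python
from enum import Enum

class Outcome(Enum):
    ACCEPTED = "accepted"
    TEMP_FAIL_421 = "421"
    NO_CONNECTION = "no connection"  # SYN unanswered, or RST received

def deliver(mx_hosts, attempt):
    """Try MX hosts in preference order (lowest value first).

    mx_hosts: list of (preference, hostname) pairs.
    attempt:  callable taking a hostname and returning an Outcome.
    Returns (action, hostname).
    """
    for _, host in sorted(mx_hosts):
        outcome = attempt(host)
        if outcome is Outcome.ACCEPTED:
            return ("delivered", host)
        if outcome is Outcome.TEMP_FAIL_421:
            # Assumed behaviour: a 421 means queue locally, retry later.
            return ("queued_for_retry", host)
        # No connection at all: fall through to the next MX.
    return ("queued_for_retry", None)

mx = [(20, "backup.example"), (10, "primary.example")]

def primary_silent(host):
    # The overloaded primary does not complete the connection.
    if host == "primary.example":
        return Outcome.NO_CONNECTION
    return Outcome.ACCEPTED

print(deliver(mx, primary_silent))
print(deliver(mx, lambda host: Outcome.TEMP_FAIL_421))
```

In the first case the message ends up at the secondary; in the
second it stays in the sender's queue, to come back later along
with everyone else's.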

Of course, this only works if there is a secondary MX, which isn't
the case for messages MX'd to nsfnet-relay.ac.uk - but that could
be fixed.

In any case, the secondary MX also has the effect of preventing most
of the burst load after a link has been down, by collecting all of
the messages while the link is down, and then delivering them serially
when it returns, imposing a nice steady load.

kre