Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!husc6!rutgers!ucla-cs!admin.cognet.ucla.edu!casey
From: casey@admin.cognet.ucla.edu (Casey Leedom)
Newsgroups: comp.mail.sendmail
Subject: Re: gratuitous munging of ats in domain names
Keywords: sendmail, spaces in match patterns, tokenization
Message-ID: <16463@shemp.CS.UCLA.EDU>
Date: 4 Oct 88 15:14:25 GMT
References: <691@mailrus.cc.umich.edu> <710@mailrus.cc.umich.edu> <16244@shemp.CS.UCLA.EDU>
Sender: news@CS.UCLA.EDU
Reply-To: casey@cs.ucla.edu (Casey Leedom)
Organization: none
Lines: 67

> From: vjs@rhyolite.sgi.com (Vernon Schryver)
>
> > Hhmmm, why so it does.  Amazing.  Well, if you change the rule to:
> >
> >     R$+\ at\ $+        $1@$2
> >
> > (at least on sendmail 4.12), it starts working correctly.
>
> I tried this, I think, but can't make it work.  With the backslashes, the
> rule does not seem to match 'joe at foobar' at all.  (I'm assuming
> everything we're talking about is a blank, not a tab.)  What am I missing?
>
> SGI's sendmail is currently 5.52 with pathalias hacks similar to IDA, but
> done earlier or at least independently.

Uh sorry, I should have looked before I leapt.  I just tested the fact
that it would no longer match on ``user@foo.at.bar''.  I forgot to test
``user at foo''.  Very sloppy of me.

Unfortunately, now that I've been forced to look at it harder than is
required for a flip answer, I can see what the problem is.  Rules are read
by the same code which reads addresses.  Hence the interpretation of spaces
(per RFC822).

    [RFC822, section 3.4.2, WHITE SPACE:

    Note:  In structured field bodies, multiple linear space ASCII
           characters (namely HTABs and SPACEs) are treated as
           single spaces and may freely surround any symbol. ...]
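
To make the effect of that white space handling concrete, here is a rough Python sketch of a tokenizer that treats blanks and tabs purely as separators. This is not sendmail's actual parsing code: the function name `tokenize` and the delimiter set are made up for illustration, and quoting is deliberately left out.

```python
# Hypothetical delimiter set, loosely modelled on the operator characters a
# sendmail.cf might list.  An assumption for illustration only.
DELIMITERS = ".:%@!^=/[]"


def tokenize(address):
    """Split an address into tokens, discarding linear white space.

    Simplified sketch: quoted strings, comments and backslash escapes
    are not handled.
    """
    tokens = []
    current = ""
    for ch in address:
        if ch in " \t":
            # White space only ends the current token; it is never kept.
            if current:
                tokens.append(current)
                current = ""
        elif ch in DELIMITERS:
            # A delimiter ends the current token and is a token itself.
            if current:
                tokens.append(current)
                current = ""
            tokens.append(ch)
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


print(tokenize("user at host"))      # ['user', 'at', 'host']
print(tokenize("user@foo.at.bar"))   # ['user', '@', 'foo', '.', 'at', '.', 'bar']
```

In this sketch the blanks around "at" leave no trace in the token list, so anything that only sees the tokens cannot tell how many blanks (or tabs) were there, or where exactly they were.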

Even if you do get the spaces into the rule properly by quoting them
(again as per RFC822, now that we know how sendmail parses rule set LHSs)
via:

    R$+" "at" "$+        $1@$2

it still won't match ``user at host'' because the address string's spaces
aren't quoted!  That is, the address would have to come in as:

    To: user" "at" "host

And fundamentally we're screwed at this point because by the time our
rewriting rules get a crack at an address, this level of parsing has
already gone down on the input address, so that given the line:

    To: user at host

all we ever see are the tokens "user", "at", and "host".  We don't get to
see the spaces.

What you'd like to do is somehow define a class, say o, that contained
all the delimiter tokens, then write the rule as:

    R$+$~oat$~o$+        $1@$2

but, unfortunately, sendmail's class matching code is just a little weak,
so this doesn't work.  (Not even Sun's sendmail handles this bit of class
matching, and Sun's sendmail, through Bill Nowicki's work, has a lot of
fixes to the class matching code.)

The ``user at host'' translation could probably be done through a
complicated series of tokenizations and retokenizations, but I wouldn't
want to try to come up with it, or maintain it, or even look at it.

The solution is probably just to remove the rule and leave the rewriting
to gateways that absolutely have to do it.  Better yet, have everyone
remove it and force the offending sites generating the addresses to ditch
the form.

Casey