Path: utzoo!mnetor!uunet!husc6!rutgers!rochester!pt.cs.cmu.edu!andrew.cmu.edu!cfe+
From: cfe+@andrew.cmu.edu (Craig F. Everhart)
Newsgroups: news.admin
Subject: Message-ID bugs in nntp and netnews?
Message-ID:
Date: 11 Apr 88 21:34:22 GMT
References: <8804090600.AA14895@Larry.McRCIM.McGill.EDU>
Organization: Carnegie Mellon University
Lines: 71

I happened to notice the message from der Mouse arguing that Message-IDs
aren't supposed to be case-sensitive. We both agreed that according to
RFC822, they're case-sensitive (well, the part to the left of the @-sign is
case-sensitive). Here are parts of an interchange we had on the subject.

> Date: Sat, 9 Apr 88 02:00:17 EDT
> From: der Mouse
> To: cfe+@andrew.cmu.edu
> Subject: Re: Message-IDs: how they're built

> > Yes, we running andrew.cmu.edu are under the impression that
> > Message-IDs are case-sensitive. I can find you chapter and verse in
> > RFC822 if you like.

> I've found it myself. (I was surprised by this when I noticed it in
> nntp. Struck me as a bit of a misfeature, and now you tell me it's an
> out-and-out bug.)

> > Does netnews not conform?

> Well, our copy of 2.10.3 doesn't. And I haven't touched the history
> file code.

> > How does it differ in this regard?

> When accessing the dbm-format history file, Message-IDs are lowercased.
> The text history file contains the Message-ID in the original case, but
> the dbm file contains it lowercased, and on lookup of course it is
> lowercased as well.

> > What might suffer if two different messages, with message-ID fields
> > differing only in case, hit the netnews distribution mechanism?

> Systems that have this bug, of which there are doubtless many, would
> reject whichever article they receive later, under the impression
> they've already seen it. It will thus neither appear at nor propagate
> through such sites.

Thus, there's a clear problem.
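To make the failure concrete, here is a minimal sketch (in Python, not the
actual netnews C code) of the duplicate check der Mouse describes, next to a
variant that folds only the part to the right of the @-sign. The function
names and the sample Message-IDs are made up for illustration, and the
in-memory set merely stands in for the dbm history file.

```python
def fold_whole_id(message_id: str) -> str:
    """History key as described in the exchange: the whole ID lowercased."""
    return message_id.lower()


def fold_domain_only(message_id: str) -> str:
    """History key that lowercases only the text after the last '@'.

    Assumption: the domain part contains no '@', so the last '@' is the
    separator. An ID with no '@' at all is returned unchanged.
    """
    local, sep, domain = message_id.rpartition("@")
    if not sep:
        return message_id
    return local + sep + domain.lower()


def accept_articles(message_ids, key):
    """Return the IDs a site would accept, rejecting any whose key was seen.

    The set plays the role of the dbm history file (hypothetical stand-in).
    """
    seen = set()
    accepted = []
    for mid in message_ids:
        k = key(mid)
        if k in seen:
            continue  # treated as a duplicate: neither stored nor propagated
        seen.add(k)
        accepted.append(mid)
    return accepted


# Hypothetical IDs: the first two are distinct messages whose local parts
# differ only in case; the third is the first one again with the domain
# capitalized differently.
arrivals = [
    "<AbC@andrew.cmu.edu>",
    "<abc@andrew.cmu.edu>",
    "<AbC@Andrew.CMU.EDU>",
]

kept_by_whole_fold = accept_articles(arrivals, fold_whole_id)
kept_by_domain_fold = accept_articles(arrivals, fold_domain_only)
```

With whole-ID folding the second message is dropped as a supposed duplicate;
with domain-only folding both distinct messages are kept, and only the
re-capitalized copy of the first is rejected.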
We read the standards as permitting case-sensitive local-parts in
Message-IDs (even if the domain part can be case-folded), yet
widely-distributed software doesn't follow that convention. How about if
Netnews software only lowercased the @domain part of message-IDs instead of
the whole thing? This would be correct, and would also deal with the
principal source of varying case in mail headers: varying capitalizations
of the same domain name.

As you can probably tell, Andrew message-IDs are a bunch of bits (time,
composing machine's IP address, other stuff) spelled out in a large-base
number (currently it's a base-64 number). If the problem is serious enough,
we could switch to a smaller alphabet that doesn't require that upper and
lower case be treated distinctly. This would, of course, make the
message-IDs longer, and no more transparently decodable. We went to the
base-64 scheme a year or so ago because our Message-IDs were too long for
some old netnews systems (longer than 64 characters), so we're a little
reluctant to make yet another change (to conform to netnews requirements)
without knowing if there are additional (#$@*%*) constraints that we'd also
have to meet.

I'm interested in hearing about (a) any additional constraints, other than
what's already in RFC822, (b) whether there's hope of getting ``the
current'' news software fixed if it's indeed broken in this way, and (c)
how far one might realistically hope such fixes might propagate, and how
soon. If there's a better forum for this discussion, I'm happy to be
educated.

		Thanks,
		Craig Everhart
		Andrew message system
		Internet: cfe+@andrew.cmu.edu