Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!seismo!lll-lcc!ames!ucbcad!ucbvax!ucbarpa.Berkeley.EDU!fair
From: fair@ucbarpa.Berkeley.EDU.UUCP
Newsgroups: news.sysadmin,news.admin
Subject: Re: Negative Article ID's
Message-ID: <18662@ucbvax.BERKELEY.EDU>
Date: Fri, 1-May-87 06:25:09 EDT
Article-I.D.: ucbvax.18662
Posted: Fri May 1 06:25:09 1987
Date-Received: Sat, 2-May-87 14:15:28 EDT
References: <3673@drutx.ATT.COM> <1987Apr27.131502.8062@sq.uucp>
Sender: usenet@ucbvax.BERKELEY.EDU
Organization: USENET Protocol Police, Western Gateway Division
Lines: 37
Xref: utgpu news.sysadmin:167 news.admin:331
Summary: message-ids are NOT just "unique strings"

In article <1987Apr27.131502.8062@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
> It should perhaps be pointed out, in case anyone else is confused,
> that the Message-ID in news is simply a string, and any site is
> allowed to generate it in any format they like.

This is absolutely, flatly FALSE. There is a standard for USENET
message headers, defined in the document RFC850, which is included in
the B news 2.11 distribution. It cites that the message-id should be in
RFC822 syntax, excepting that a USENET message-id should never contain
linear white space (spaces or tabs). RFC822 is even more grindingly
specific about the format and content of a message-id; I once wrote a
parser/syntax checker for it.

> For instance, check out the Message-ID on THIS article. It was
> generated by C news (the as-yet-unreleased system described by Geoff
> Collyer and Henry Spencer at the last USENIX conference). I presume
> they use this format because it is easy to generate reliably in sh.

It does happen that the message-id that you cite (reproduced in the
first line of this article) is legal according to the specifications.
Knowing both Henry & Geoff as I do, I really doubt that it was an
accident that the format is legal (and, as you note, trivially
generated from a shell script).
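As an illustration of the rule described above (RFC822 msg-id syntax, with the added USENET restriction that no white space may appear), here is a minimal sketch of a checker in Python. It is not the parser mentioned in this article. The function name `is_valid_message_id` is hypothetical, and the sketch makes two simplifying assumptions: it accepts only printable, non-space ASCII (stricter than RFC822, which allows some control characters inside quoted strings), and it does not handle RFC822 comments.

```python
import re

# RFC822 "atom": one or more printable ASCII characters that are not
# specials ( ) < > @ , ; : \ " . [ ] and not space or control characters.
ATOM = r"[!#$%&'*+\-/0-9=?A-Z^_`a-z{|}~]+"
# quoted-string: anything but an unescaped quote or backslash, or a quoted pair.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# domain-literal: bracketed text without unescaped brackets or backslashes.
DOMLIT = r"\[(?:[^\[\]\\]|\\.)*\]"

WORD = rf"(?:{ATOM}|{QUOTED})"
SUBDOMAIN = rf"(?:{ATOM}|{DOMLIT})"

# msg-id = "<" local-part "@" domain ">", each a dot-separated sequence.
MSG_ID = re.compile(rf"<{WORD}(?:\.{WORD})*@{SUBDOMAIN}(?:\.{SUBDOMAIN})*>")


def is_valid_message_id(s):
    """Return True if s looks like a legal USENET message-id (sketch only)."""
    # Assumption: reject anything outside printable, non-space ASCII.
    # This enforces the "no spaces or tabs" rule in a single pass.
    if any(not (33 <= ord(c) <= 126) for c in s):
        return False
    return MSG_ID.fullmatch(s) is not None


if __name__ == "__main__":
    for mid in ("<18662@ucbvax.BERKELEY.EDU>",
                "<1987Apr27.131502.8062@sq.uucp>",
                "<bad id@site>",
                "no-brackets@site"):
        print(mid, is_valid_message_id(mid))
```

Both message-ids that appear in this article's headers pass the check, while a string containing a space or lacking the angle brackets does not.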

When I get around to recoding the USENET to ARPANET side of the gateway
on ucbvax (the process in that direction is mostly an AWK script, and
it will be handling increasing volumes as time goes on, so it is now
time to recode it in C; going the other way, it's already in C: a new
version of recnews), the message-id syntax checker will be included in
that program, and it will refuse to gateway any USENET article that
does not conform to spec.

	where is Mr. Protocol when I need him?

	Erik E. Fair	ucbvax!fair	fair@ucbarpa.berkeley.edu