Xref: utzoo news.admin:6912 news.misc:3594
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!rutgers!apple!vsi1!octopus!avsd!childers
From: childers@avsd.UUCP (Richard Childers)
Newsgroups: news.admin,news.misc,alt.conspiracy
Subject: Who's Messing With The Usenet ?
Keywords: article, duplication, monkey business
Message-ID: <2056@avsd.UUCP>
Date: 15 Sep 89 19:46:49 GMT
Reply-To: childers@avsd.UUCP (Richard Childers)
Organization: Metaprogrammers International
Lines: 175

I've been noticing a lot of duplicate articles recently.

Now, I've been reading the Usenet, on and off, for about five or six
years now, and I have _never_ seen anything like this in my life.

At first, I assumed that it was something wrong with my installation,
because I was mucking around with things, and in fact as a result of
trying to get the IHAVE / SENDME protocol to work with multiple sites
I _was_, for a while, getting considerable duplication.

But a fair amount of rigorous thinking in the past month has convinced
me that I'm not the problem here.

I've been reading for, oh, over a month now, about how duplicate
articles have been appearing across many newsgroups. This was my first
hint that it was a widespread problem that was affecting everybody.

Like everyone else, I watched and waited for someone to do something,
for someone to trace the problem.

Nothing happened. Oh, many people tried to find a pattern, but it
wasn't there. I began to smell a rat.

Now, like everyone else, I have gained much from judicious application
of the excellent phrase, "Assume stupidity before malice", derived from
Bill Davidson's .signature, I believe, and I'm still not convinced that
what's happening is anything other than normal hoseheadedness.

But the absence of any sort of pattern in the data acquired from
articles' headers makes me wonder, because it is quite common for
people to modify headers for their own immature reasons.

( Personally, I think it's equivalent to changing the address on a
letter, or otherwise defacing a piece of mail, without the owner's
permission. Quite inconsistent with the tenets of intercooperation
around which the Usenet was founded, more suggestive of children
fighting over a toy than it is suggestive of a tool developed for
civilized and globally relevant purposes of advancing human knowledge
and accomplishment, if you know what I mean. )

So, I finally vowed to explore the matter the next time it bugged me
and I had a few spare moments.

Here's some actual real data, made from a small and thus uncomplicated
sampling of a small and low-traffic newsgroup, alt.bbs.

	avsd# cd /usr/spool/news/alt/bbs
	avsd# ls
	231 232 233 234 235 236 237 238 239 240 241

A small sample, about 11 articles, all less than two days old.

	avsd# grep Message-ID *
	231:Message-ID: <4347@udccvax1.acs.udel.EDU>
	232:Message-ID: <11290@kuhub.cc.ukans.edu>
	233:Message-ID: <533@sud509.RAY.COM>
	234:Message-ID: <534@sud509.RAY.COM>
	235:Message-ID: <537@sud509.RAY.COM>
	236:Message-ID: <9626@venera.isi.edu>
	237:Message-ID: <935C18IO029@NCSUVM>
	238:Message-ID: <37058@conexch.UUCP>
	239:Message-ID: <11519@kuhub.cc.ukans.edu>
	240:Message-ID: <37058@conexch.UUCP>
	241:Message-ID: <11519@kuhub.cc.ukans.edu>

Ah, we have some duplicate articles, two out of the last three ...

	avsd# grep Path 239 241
	239:Path: avsd!vixie!decwrl!wuarchive!kuhub.cc.ukans.edu!orand
	241:Path: avsd!octopus!sts!claris!apple!brutus.cs.uiuc.edu!...
		...wuarchive!kuhub.cc.ukans.edu!orand

Now we have some real data. There are three machines which are found
in both "Path:" fields. Two of them are the source and the destination.
The third is "wuarchive".

Now, at this point it would normally be appropriate to run, screaming
accusations all the way, straight to the administration of "wuarchive",
all self-righteous as all get-out, demanding to know what they are
doing.

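( The check above, grepping the spool for "Message-ID:" lines and then
comparing the "Path:" lines of whatever turns up twice, can be sketched
in Python. This is only a minimal sketch under stated assumptions, not
part of the investigation described in this article: the function names
are hypothetical, the spool is assumed to hold one article per file as
in the `ls` listing, and header continuation lines are ignored. )

```python
import os
from collections import defaultdict


def load_articles(directory):
    """Read every regular file in a spool directory (assumed layout:
    one article per file, as in the `ls` listing above)."""
    articles = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            with open(path, errors="replace") as handle:
                articles[name] = handle.read()
    return articles


def parse_headers(text):
    """Return {lowercased header name: value} for one article.

    Headers stop at the first blank line. Continuation lines (those
    starting with whitespace) are skipped for simplicity.
    """
    headers = {}
    for line in text.splitlines():
        if not line.strip():
            break  # blank line separates headers from body
        name, sep, value = line.partition(":")
        if sep and not line[0].isspace():
            headers.setdefault(name.strip().lower(), value.strip())
    return headers


def find_duplicates(articles):
    """Map each Message-ID seen more than once to its article names.

    `articles` maps an article name (e.g. "239") to its raw text.
    """
    by_id = defaultdict(list)
    for name, text in articles.items():
        msgid = parse_headers(text).get("message-id")
        if msgid:
            by_id[msgid].append(name)
    return {m: sorted(names) for m, names in by_id.items() if len(names) > 1}


def common_hosts(paths):
    """Hosts present in every Path; the last component (the user) is dropped."""
    host_sets = [set(p.split("!")[:-1]) for p in paths]
    return set.intersection(*host_sets) if host_sets else set()
```

( Fed the two "Path:" lines shown above, with the "..." kept as an
opaque token exactly as printed, `common_hosts` leaves avsd, wuarchive
and kuhub.cc.ukans.edu, the same three machines picked out by eye. A
shared host only shows where the two routes overlap; by itself it says
nothing about which site, if any, introduced the second copy. )
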
But I'm not sure they are doing _anything_, because I'm assuming that
I'm not the first person who has approached this problem in this
fashion.

So, instead, I'm going to take a leap of the imagination and try to
imagine why such a situation might occur, what circumstances might
impinge upon the Usenet that would lead to massive forgery and
duplication.

The answer that occurs to me is, quite bluntly, sabotage. It is a
well-established trick, made exemplar by the actions of senior Usenet
people, to generate forged headers, as I said before, and insert them
into the queue.

These articles, given their untraceable nature, are very possibly
forged articles. The sites found in the "Path:" field are, presumably,
interconnected ... which argues for a fairly sophisticated and detailed
effort, not the act of an average college student, who would presumably
still be so dazzled by the wealth of information available that s/he
would never think of sabotaging it, incidentally.

No, if such an act is being perpetrated, it is coming from an
individual or group thereof with considerable attention to detail.

Why would someone do such a thing ? I can think of several reasons.

    (1) Jealousy. There has been considerable territorialism lately,
        people posting to moderated groups and the like, commercialist
        behavior. Some people prefer to make sure that if they can't
        play, nobody can play.

    (2) Disinformation. The Usenet represents a substantial and
        sophisticated alternative to normal channels of communication,
        one less subject to control through covert or coercive
        activities, as many of the sponsors are Real Big Corporations,
        not necessarily willing to agree with the marching orders of a
        hypothetically interested government. Remember COINTELPRO,
        multiply by several orders of magnitude where information
        processing capacity and expertise are concerned, divide by the
        number of Constitutional Amendments you've seen waylaid
        recently, and tell me what you get.

    (3) Stupidity.
        Someone has some inscrutable motive, or there is a random
        scattering of badly-installed netnews sites that appear to
        approach a significant minority and are scattered fairly
        evenly through the United States.

        ( Perhaps the next phase in this research might be to
        coordinate efforts to identify source(s) by collecting the
        name of every machine that _appears_ to be a problematic
        machine, using methods outlined above, and examine this with
        an eye for statistical anomalies or patterns of placement.
        For instance, they might all fall within a few states. If
        they are evenly distributed geographically, that is possibly
        evidence of a sophisticated effort to muddy the trail, and
        important to establish. )

    (4) Malice. Some group has acquired sufficient expertise and an
        invisibly coordinated set of Usenet sites, ostensibly
        independent sites, placed them in positions of moderate but
        not excessive visibility amongst the crowd, and are using
        their position to damage the Usenet's interconnectivity.

Why, you say ? What's the point ?

Well, I think there is a clear end result here, and it's clogging the
channels. Duplicate an article here and there, one per newsgroup per
day, and pretty soon some of the lesser sites are filling up their
disks. Soon, the administrations are calling for things to be omitted
from the 'spool' partitions. Maybe the entire news installation might
be deinstalled, perhaps only those parts of it irrelevant to the
specific commercial mission of the individual companies.

It's been going on for quite a while now, and it's gotten rather
noticeable at my site. If we hadn't enlarged our spool partition, we
might still be getting regular "filesystem full" messages, and that was
with a _lot_ of space and 'expire' getting rid of everything more than
two days old.

I don't know who's doing it. To tell you the truth, I'm still prepared
to find an error in my thinking, all down the line.
But it _seems_ to be common everywhere, and so I hesitate to discount
my hypotheses until I hear from a few others on the topic, the results
of their own research, and their thoughts / hypotheses.

I do know it needs to be fixed, since it won't fix itself, wherever
it's coming from.

I'm also curious if this type of thing has been encountered in other
networks, such as FIDO, which certainly has the circumstances under
which such things might happen. The problem is that I don't know if
they have restricted interconnectivity to conform with requirements for
linearity, or have allowed potential looping paths to evolve in their
interconnections, compensating with article ID checks in the software.

I must admit that I'm puzzled as to how this is happening, as netnews
is _supposed_ to be checking articles against the 'active' articles
database. Perhaps the "Message-ID:" field is being invisibly corrupted,
or the software decides by comparing Message-ID and Path, classifying
them as identical only if they _both_ match, to avoid the vague but
present possibility of two articles from divergent sites being
generated with identical Message-IDs.

Anyhow, some thoughts that have been brewing for about two weeks. I'd
like to hear some responses ... reactions will be reacted to in a vein
similar to that in which they were conveyed, but intelligent commentary
will receive the respect it deserves.

-- richard

-- 
* * * Intelligence : the ability to create order out of chaos. * * *
* ..{amdahl|decwrl|octopus|pyramid|ucbvax}!avsd.UUCP!childers@tycho *