Xref: utzoo comp.newprod:280 news.admin:5908 news.software.b:2274
Path: utzoo!utgpu!watmath!looking!brad
From: brad@looking.on.ca (Brad Templeton)
Newsgroups: comp.newprod,news.admin,news.software.b
Subject: NewsClip news filtering language and newsreading tool
Summary: A compiler that lets you write programs to turn USENET into the network that you want it to be.
Message-ID: <3495@looking.on.ca>
Date: 9 Jun 89 03:51:24 GMT
Reply-To: clarinet@looking.on.ca
Followup-To: news.admin
Organization: Looking Glass Software, Waterloo Ont.
Lines: 182
Approved: brad@looking.on.ca (With the permission of Ron Heiby)
Class: news, commercial

Looking Glass Software Limited & ClariNet Communications Corp. announce the:

        NewsClip Programming Language
        (A USENET news filtering tool.)

NewsClip is a programming language that lets you filter the netnews you feed and read as finely as you care to code. Compiled NewsClip programs can be used as super-fancy kill files, or as the core of a whole netnews system.

If you think there's too much USENET news to read, then we think NewsClip is your answer. You can tune down USENET as much as you like. It doesn't do AI or replace a moderator, but it can help a great deal.

Using NewsClip, you write programs in a C-like language to describe the USENET (or ClariNet) articles you would like to see. You can use expressions based on anything that appears in an article header, the presence of regular expressions in sections of the article body or header strings, or statistics about the article (like the number of lines of included text).

Your programs are turned into a C program which is then compiled and linked with the NewsClip library into an executable that can filter news.

For example, you could reject articles in news.groups that are crossposted to soc.women by saying:

        reject if is news.groups && is soc.women;

You can accept or reject articles, or give them weights so that they are accepted if they meet enough good criteria, or rejected if they meet enough bad ones. You could even say:

        if( count(newsgroups) > 3 || subject has "test.*message" ||
                (distribution_level >= dlevel(#usa) && newsgroups has "^talk" ) )
            adjust -20;

(This gives a negative weight to articles crossposted to more than 3 groups, or which are test messages, or which are in the talk hierarchy and posted to a distribution equal to or larger than the "usa" distribution.)

If you want, you could get your program to reject all followups to articles with signatures over 10 lines that mention Ronald Reagan or were crossposted to more than 3 groups, unless they were written by Tim Maroney.

You can get as fancy as you like -- but it's also very easy to code simple but powerful things. NewsClip is worth it even if all you wish to do is make anti-kill files or reject article trees based on the References chain.

The NewsClip database feature lets you remember things, such as users you like or don't like, or the message-ids or subjects of articles whose followups you do or don't want to see.

How it filters:

The compiled program can be used in a number of ways.

o) You can run it stand-alone. It reads your .newsrc, checks all your new articles and marks as read the ones you don't want to read.

o) With a few mods to RN or other newsreaders, you can run it in parallel, talking to RN via pipes. This causes filtering as-you-read, usually with no delay, unlike KILL files.

o) You can filter a list of articles, removing rejected files from the list. This allows you to control a batch feed to finely tune feeding of your site or another site. Why transmit what you don't want to read? This can save lots of money, modem time and disk space.

o) You can arrange to feed a site from a .newsrc instead of the "sys" file. The filter program checks the .newsrc, filters the unread articles, and outputs the filenames of the unread articles for output to sendbatch. Fine tune subscription as much as you like.

(One neat trick is to "or" together all the .newsrc files on your system, and ship this every day to your feed site to be used as your system's subscription file. You thus feed exactly the groups people are reading on your system. If everybody unsubscribes, the feed stops. If people subscribe, it starts one day later.)

NewsClip can do more than remove what you don't want to read. Instead, it can show you only what you do want to read -- the reverse of a kill file. You can vary this from group to group or message to message.

For example, in large volume groups, I configure NewsClip to show me only new (non-followup) articles. If I like an article, I mark it as a topic I am interested in. It's like reading a group that's really 10 times smaller.

The real power comes when you update your filtering under program control. Say you really hate a user, for example. You can easily make your NewsClip program reject all articles from that user with a simple:

        reject if From == "ihate@badsite.com";

Better still, you can code:

        if( From == "ihate@badsite.com" ) {
            badmessages[Message_id] = true;
            reject;
            }
        if( badmessages has References )
            reject;

This not only rejects the bad user's articles, it puts them in a database so that all followups to those articles are also rejected. It's as though the offending user doesn't even exist on your net.

That's what NewsClip is for. It lets you turn USENET into just the network that you want to see, without imposing your will on anybody else.

When you link your program and RN, you can have RN send commands to the filter program to update the program's databases. (The 'badmessages' variable above is a database.) In this way, you can use the databases like powerful kill (or accept) files.
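
For readers who don't have NewsClip at hand, the followup-rejection logic above can be sketched in ordinary Python. This is only an illustration of the idea, not NewsClip code or its library: the function name make_filter, the dictionary keys, and the reading of "has References" as "any Message-ID in the References chain is in the database" are all assumptions made for this sketch.

```python
# Hypothetical sketch of the 'badmessages' example; not part of NewsClip.

def make_filter(bad_author):
    """Build a filter that rejects one author and followups to that author."""
    bad_messages = set()  # stands in for the 'badmessages' database

    def accept(article):
        """Return True to show the article, False to reject it."""
        if article["from"] == bad_author:
            # Remember the Message-ID so that later followups can be caught.
            bad_messages.add(article["message_id"])
            return False
        # Assumption: References holds space-separated ancestor Message-IDs.
        refs = article.get("references", "").split()
        if any(ref in bad_messages for ref in refs):
            return False
        return True

    return accept


accept = make_filter("ihate@badsite.com")
original = {"from": "ihate@badsite.com",
            "message_id": "<1@badsite.com>", "references": ""}
followup = {"from": "ok@site.com",
            "message_id": "<2@site.com>", "references": "<1@badsite.com>"}
unrelated = {"from": "ok@site.com",
             "message_id": "<3@site.com>", "references": ""}
results = [accept(original), accept(followup), accept(unrelated)]
```

Note that the sketch, like the NewsClip fragment, only catches a followup if the offending article was seen first, so articles need to be filtered roughly in arrival order.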

Other features include:

o) The ability to expire items from databases that have not been referenced in a specified time.

o) The ability to examine the distribution of an article so that you only see local messages in certain groups.

o) The ability to scan unsubscribed groups for rare important messages, and do an automatic re-subscribe when one arrives.

o) The ability to compile quick search programs that can scan all or part of the news spools for key articles.

o) Variables, procedures, functions, switch, for, while etc.

o) Define your own header lines as you need them.

o) Link to C library functions or your own C code.

NewsClip comes with source to the compiler, libraries and sample programs. There is an extensive typeset manual, for which troff source is also available. You do not need special permissions to use it on your system, and it can even be distributed in binary form. It works with BSD, SYSV and Xenix.

If you value your time, or your long distance money, NewsClip will pay for your ClariNet subscription or licence fee in a very short time.

How to Get it:

NewsClip is provided at *no charge* while you are a ClariNews subscriber, and to ClariNet site subscribers above the $30/month level. If you are not a ClariNet site, or you leave ClariNet, a licence can be purchased for a fee that depends on the site size. Contact us for details. We can ship via uucp from our own site or UUNET, or on Xenix or IBM-PC floppy disk.

Who to contact:

Write to clipinfo%clarinet.com@uunet.uu.net for more info and sample programs. Or phone us at 1-800-265-2782 or 1-519-884-7473. If your site can't mail to the above address, try clipinfo@looking.on.ca.

Or write:
        ClariNet Communications Corp.
        124 King St. N.
        Waterloo, ON N2J 2X8

I will give a short talk about NewsClip at the Usenix work in progress session, and will discuss it during the USENET interfaces BOF there.

(NOTE: If you run NNTP, currently the only way to use NewsClip is to link with RRN for dynamic filtering.
The stand-alone mode does not currently work if the news spools do not exist on your machine.)

Don't complain about it. Filter it.
-- 
Brad Templeton, Looking Glass Software Ltd. -- Waterloo, Ontario 519/884-7473