Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!cs.utexas.edu!uunet!algor2.algorists.com!jeffrey
From: jeffrey@algor2.algorists.com (Jeffrey Kegler)
Newsgroups: news.admin
Subject: "Expiration" of news based on .newsrc
Summary: Method of expiration perhaps useful at small sites
Message-ID: <1989Sep13.070708.20905@algor2.algorists.com>
Date: 13 Sep 89 07:07:08 GMT
Reply-To: jeffrey@algor2.UUCP (Jeffrey Kegler)
Organization: Algorists, Inc.
Lines: 65

For those sites with a small and predictable number of news readers,
expiring articles based on the .newsrc files of the people who read
news may make sense. The following script generates a list of all
news articles, checks it against a .newsrc file, and removes those not
vouched for. It is freely redistributable.

The script below can easily be changed to handle multiple .newsrc
files by pulling out the stuff that checks the .newsrc and putting it
in a loop. As I am a one-person site, I have no way of testing that
result, so I leave it as an exercise for an interested party. The
location of my news spool directory is non-standard and will probably
have to be changed at your site.

The method of "expiring" articles (simply deleting them), while crude,
was said by Henry Spencer to be reasonable when I asked him about it
(nonetheless any flames go to me, not Henry). You will want to
continue having date-based expiration to keep your history file sane,
but the extra space should enable you to extend the length of time of
your date-based expiration.

I have run this script for some weeks on algor2 without obvious ill
effects, but no warranty is given. It is clearly inappropriate for
many, perhaps most, sites, but is distributed in the hope it will be as
worthwhile at a significant minority of them as it is at mine.
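As an illustration of how the script below prepares the .newsrc for matching against spool paths, here is a minimal sketch of its sed step run on its own. The function name normalize_newsrc and the sample .newsrc lines are invented for this example. The four sed commands are the ones from the script, only split into separate -e arguments.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): reads .newsrc
# text on stdin and prints one "group/path ranges" line per kept group.
normalize_newsrc() {
    # 1. drop "options" lines
    # 2. drop lines containing any character outside the listed set,
    #    which removes unsubscribed groups (they contain "!")
    # 3. replace the ": " after the group name with a single space
    # 4. turn the dots of the group name into slashes (a spool path)
    sed -e '/^options/d' \
        -e '/[^a-z0-9., :+-]/d' \
        -e 's/: */ /' \
        -e 's=\.=/=g'
}

# Sample input, invented for illustration.
printf '%s\n' \
    'options -n' \
    'comp.lang.c: 1-100,105' \
    'talk.bizarre! 1-50' \
    'news.admin: 1-2000' | normalize_newsrc
```

For the sample input this prints "comp/lang/c 1-100,105" and "news/admin 1-2000". The options line and the unsubscribed group are gone, so articles in talk.bizarre would later find no matching range list.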
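=== Start of newsclear ===
if test $# != 1
then
	echo usage: $0 newsrc 1>&2
	exit 1
fi

# Work relative to the news spool (site-specific path).
cd /news/spool

# Remove the temporary file on exit or on signals 1, 2 and 3.
TMP=/tmp/$$.1
trap "rm -f $TMP; trap 0 1 2 3; exit 0" 0 1 2 3

# Turn each "group.name: ranges" line of the .newsrc into
# "group/name ranges", sorted on the group path for join.  Lines with
# any other character (such as the "!" of an unsubscribed group) are
# dropped.
sed -e '
/^options/d
/[^a-z0-9., :+-]/d
s/: */ /
s=\.=/=g' $1 |
sort -b +0 -1 > $TMP

dirlist="alt bionet biz comp control dc dr general gnu mail misc ne news pubnet rec sci soc talk u3b um unix-pc"

# List every article as "group/path number", pair it with the .newsrc
# ranges for its group, and print the path of each article that is
# marked read or whose group has no range list.  Those are removed.
find $dirlist -type f -print |
sed -e 's=/\([0-9]*\)$= \1=p' |
sort -b +0 -1 |
join -a1 - $TMP |
nawk '
{
	read=0
	if (split($3, range, ",") == 0) read=1
	for (sequence in range) {
		split(range[sequence], end, "-")
		if (end[2]+0 == 0 && $2+0 == end[1]) {
			read=1
		}
		else if (end[1]+0 <= $2 && $2 <= end[2]+0) {
			read=1
		}
	}
	if (read == 0) next
	printf("%s/%d\n", $1, $2)
}' |
xargs rm -f
=== End of newsclear ===
-- 
Jeffrey Kegler, Independent UNIX Consultant, Algorists, Inc.
jeffrey@algor2.ALGORISTS.COM or uunet!algor2!jeffrey
1762 Wainwright DR, Reston VA 22090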
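The range test inside the nawk program of newsclear can be tried in isolation before letting the script loose on a spool. What follows is a minimal sketch, not the author's code. The function name article_status is invented, it uses plain awk with -v rather than nawk, and it covers only the membership test. The script itself also treats a missing range list as grounds for deletion, which this helper does not model.

```shell
#!/bin/sh
# Hypothetical helper (not part of newsclear): prints "read" or "unread"
# for article number $1 against a .newsrc-style range list $2.
article_status() {
    awk -v num="$1" -v ranges="$2" 'BEGIN {
        status = "unread"
        n = split(ranges, range, ",")
        for (i = 1; i <= n; i++) {
            m = split(range[i], end, "-")
            if (m == 1 && num + 0 == end[1] + 0)
                status = "read"     # matches a single article number
            else if (m == 2 && end[1] + 0 <= num + 0 && num + 0 <= end[2] + 0)
                status = "read"     # falls inside a low-high range
        }
        print status
    }'
}

article_status 42  "1-100,105"    # read
article_status 105 "1-100,105"    # read
article_status 103 "1-100,105"    # unread
```

Only articles reported as read by this kind of test are handed to rm in the script, so checking a few numbers by hand against your own .newsrc shows what would be deleted.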