Path: utzoo!attcan!telly!robohack!woods
From: woods@robohack.UUCP (Greg A. Woods)
Newsgroups: news.software.b
Subject: Re: Dynamic "smart" expiration?
Summary: how about a goal oriented expire?
Keywords: expire,C news
Message-ID: <1989Dec28.063932.13720@robohack.UUCP>
Date: 28 Dec 89 06:39:32 GMT
References: <1989Dec27.033817.9953@smsc.sony.com>
Organization: R. H. Lathwell Associates: Elegant Communications, Inc.
Lines: 77

In article <1989Dec27.033817.9953@smsc.sony.com> dce@smsc.Sony.COM (David Elliott) writes:
> The "removability" of a file would be a function of newsgroup
> name, newsgroup size, and file age.  One might use a formula
> like:
>
>     removability = (X*size^2 + Y*age^2) - usefulness(newsgroup)
>
> The "usefulness" function would be a table of constants supplied
> by the administrator.  The values of X and Y are supplied for
> each newsgroup to give weight to these items.
>
> For example, the table entries
>
>     # Group       Usefulness   Size   Age
>     rec.music     15           3      5
>     rec           10           1      1

This is similar to some ideas I had recently.  Your article has
inspired me to put my thoughts on paper, so to speak:

I would rather still have expire do the expiring, rather than rnews.
This allows more flexibility, not to mention archive support, etc.  I
would definitely not want relaynews to do expiring too!

Your usefulness field would be a factor, between 0 and MAXINT, used to
prioritize newsgroups.  The size field would be the desired number of
articles to be kept in the spool.  This number would be decremented,
taking into account the usefulness factor, if space was really tight.

Expire would still pay attention to the Expires: header, with the same
three value control field as it currently has, in place of your
suggested age field.  The 'retention' value would have highest
priority, but with the usefulness factor applied if space was really
tight.
The 'normal' value would be of lower priority than size, and if
null the Expires: header would be followed explicitly, unless the
'purge' date overrides it.  The 'purge' value would also outweigh
both usefulness and size, but could of course be left null, or set
quite large.

In addition, expire would be given a goal (of free space) to be
achieved (i.e. a '/freespace/' line like '/history/').  Expire would
still use spacefor to determine its success.

Expire would then become a multi-pass process, but I don't think this
would impair its speed much.  In order to enhance performance, I would
place the article byte size in the history file (though block size
would be more useful).  Since all cross-references are already noted
by newsgroup, it is very easy to calculate the potential gain if an
article is expired, while keeping in mind the various explist control
lines for the article.

There could even be a flag to determine the effect on cross-posted
articles.  Either the quickest, or the longest, expire could be used
for all links, or each link could be expired separately, with space
gained only upon expiration of the last link.

In case rnews runs out of space in spooling incoming news, it can
simply wait for space, as it normally does.  I currently have the
newswatcher script run hourly and it runs an emergency expire when
space becomes tight.  For now I have a series of expire scripts which
are run in sequence until sufficient space is freed.  With a goal
oriented expire, this would be unnecessary, and indeed an emergency
expire would only be required during news floods.

Of course there must be sufficient space in your spool directory for
incoming uucp jobs while expire runs.  I am always careful to isolate
/usr/spool/news, and I usually have a separate /usr/spool/uucp, and if
not, at least a separate /usr/spool.

Also on the disk space vs. news issue, I've been thinking of changes
that would be nice in spacefor and its users in order to have finer
control in identifying space in in.coming, news spool, out.going, uucp
spool, etc.
-- 
						Greg A. Woods

woods@{robohack,gate,tmsoft,ontmoh,utgpu,gpu.utcs.Toronto.EDU,utorgpu.BITNET}
+1 416 443-1734 [h]   +1 416 595-5425 [w]   VE3-TCP   Toronto, Ontario; CANADA
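
For what it's worth, the goal-driven selection described above might
be sketched roughly as follows.  This is Python for illustration only,
not anything C news contains.  Every name here (Article,
usefulness_of, plan_expire, keep_longest) is hypothetical, the
usefulness values are borrowed from the quoted example table, and the
sketch assumes article byte sizes are already known, as they would be
if recorded in the history file.  It deliberately ignores the
Expires: header and the retention/normal/purge fields, and makes a
single selection pass where a real expire would need several.

```python
from dataclasses import dataclass


@dataclass
class Article:
    msgid: str
    groups: tuple      # newsgroups the article is cross-posted to
    size: int          # bytes, assumed to be recorded in the history file
    age_days: float


def usefulness_of(group, table):
    """Longest-prefix lookup: 'rec.music.misc' matches 'rec.music' before 'rec'."""
    parts = group.split(".")
    while parts:
        key = ".".join(parts)
        if key in table:
            return table[key]
        parts.pop()
    return 0  # groups not listed are treated as least useful (an assumption)


def plan_expire(articles, table, free_now, free_goal, keep_longest=True):
    """Pick articles to expire, least useful and oldest first,
    stopping once the free-space goal would be reached."""
    deficit = free_goal - free_now
    if deficit <= 0:
        return []
    # Cross-post flag: judge an article by its most useful group
    # ("longest" expire) or by its least useful group ("quickest").
    pick = max if keep_longest else min

    def rank(art):
        u = pick(usefulness_of(g, table) for g in art.groups)
        return (u, -art.age_days)

    victims, freed = [], 0
    for art in sorted(articles, key=rank):
        if freed >= deficit:
            break
        victims.append(art)
        freed += art.size   # space is counted once, when all links go
    return victims


table = {"rec.music": 15, "rec": 10}
spool = [
    Article("<a>", ("rec.music.misc",), 4000, 3),
    Article("<b>", ("rec.humor",), 3000, 10),
    Article("<c>", ("rec.humor", "rec.music.misc"), 5000, 20),
    Article("<d>", ("rec.humor",), 2000, 1),
]
plan = plan_expire(spool, table, free_now=1000, free_goal=5500)
print([a.msgid for a in plan])
```

With keep_longest set, the cross-posted article <c> is protected by
its rec.music link and the two plain rec articles go first; with it
cleared, <c> is ranked by its rec link and, being the oldest, is
enough on its own to meet the goal.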