Newsgroups: news.software.b
Path: utzoo!henry
From: henry@utzoo.uucp (Henry Spencer)
Subject: Re: Dynamic "smart" expiration?
Message-ID: <1989Dec29.020109.16829@utzoo.uucp>
Organization: U of Toronto Zoology
References: <1989Dec27.033817.9953@smsc.sony.com> <1989Dec28.063932.13720@robohack.UUCP> <1989Dec28.171830.13130@smsc.sony.com>
Date: Fri, 29 Dec 89 02:01:09 GMT

In article <1989Dec28.171830.13130@smsc.sony.com> dce@Sony.COM (David Elliott) writes:
>>I would rather still have expire do the expiring, rather than rnews.
>>This allows more flexibility, not to mention archive support, etc. I
>>would definitely not want relaynews to do expiring too!
>
>Actually, I was thinking more in terms of having newsrun doing the
>expiring as part of its loop.

Folks have done that with C News, although it's not something we support
officially. Possibly we should, but the obvious technique -- dynamically
generating expire's control file and cranking down the numbers until
space is adequate -- interacts awkwardly with some of the fancier things
you can do in the control file. If I can think of some graceful way to
deal with this, I'll probably make it available as an option.

>The big problem as I see it is that expire is slow (at least the B
>news version was), especially if you start adding special heuristics
>based on usefulness and group size and file age and number of
>subscribers and so forth.

C News expire is essentially entirely I/O-bound and dbm-bound (I haven't
yet run detailed timings with dbz, although I'll do it soon), so adding
a *little* complexity to the decision process would not be disastrous.
We were very close to adding the size of the file as another subfield in
the history file's middle field, so that it could be used as input for
decision making. Alas, it's *not* easy to define exactly how such
policies should work in the presence of complications like per-group
expiry settings, and we tend to believe in the theory that you should
not collect data until you have some idea what you're going to do with
it.

>If expire generated a list of files to expire once a day, you could
>still archive the files, and maintain flexibility, but when it's time
>for them to go to make room for other files, it's easy and fast,
>and until that time comes, they're still available.

I thought a bit about breaking expire into a decision part and an
implementation part, so to speak, like this. I wasn't convinced that it
offered enough advantages to be worth the effort and possible problems.
*However*... note that expire's -t option does almost exactly what the
decision module would do: it prints a description of what expire would
do, but doesn't do it. The output is *almost* an executable shell file
-- at one point it was one, until I noticed that there are some
complications like creating directories that are hard to deal with
simply -- and picking out the file names would not be hard. I will write
up the format in the documentation, so folks can depend on it.
--
1972: Saturn V #15 flight-ready| Henry Spencer at U of Toronto Zoology
1989: birds nesting in engines | uunet!attcan!utzoo!henry henry@zoo.toronto.edu