Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!uakari.primate.wisc.edu!ctrsol!emory!stiatl!rsiatl!jgd
From: jgd@rsiatl.UUCP (John G. De Armond)
Newsgroups: news.admin
Subject: Re: The Inode-Eating Bug
Message-ID: <231@rsiatl.UUCP>
Date: 4 Oct 89 21:25:29 GMT
References: <906@zorch.SF-Bay.ORG>
Reply-To: jgd@rsiatl.UUCP (John G. De Armond)
Organization: Radiation Systems, Inc. (a thinktank, motorcycle, car and gun works facility)
Lines: 369

In article <906@zorch.SF-Bay.ORG> scott@zorch.SF-Bay.ORG (Scott Hazen Mueller) writes:
>Chalk up YA site being bitten by the System 5 lost inode bug.  My vendor
>doesn't do Unix anymore (that I know of), and it's not a terribly standard
>system, so I have little hope of seeing the bug fixed.  Since it is of
>course hitting my news spool filesystem hardest, I would like to mitigate
>the effects by hacking [ir]news to spool the entire batch on a low inodes
>condition.  The necessary code changes were entirely trivial; however, I
>have no idea of whether they are really meaningful.  Does anybody know how
>the out-of-inodes condition progresses?  that is, does a stricken system
>decay steadily from the "real" inode count down toward 0 or 1, or do things
>look normal until blammo! ifree is (<100|=1) or whatever?  Perhaps a periodic
>repost of one of the analyses of the problem would be useful.

This problem is trivially easy to fix where news is concerned, news being
the most common perpetrator.  Following are several scripts that completely
solve the problem.  A prerequisite is that your news spool must reside on
a separate partition that can be unmounted.

These scripts were written by Paul Anderson (stiatl!pda or rsiatl!pda) and
modified as needed by me.

Simply install these scripts in a convenient place, usually your news/bin
directory, edit the paths, and put the cron file in.
Then sit back and enjoy an "armchair" news system.

John

------------------------- script "inodes" ------------------------------------
# this shell script monitors free disk and runs the Clean script
# if the free inode count should fall below critical levels

PANICLEVEL=1000
LIBD=/usr/lib/news/bin
SPOOLD=/news/spool
LOCK=/usr/spool/locks/LCK..news
NEWSSLICE=/dev/dsk/0s4
CLEAN=$LIBD/Clean

# check to see if anyone reading news, if so, then exit
# format of /etc/fuser input record
#   /dev/dsk/0s4:   9999   9999   9999   9999
set `/etc/fuser $NEWSSLICE 2>&1 `
if [ $# -gt 1 -o -f $LOCK ] ; then exit 0 ; fi

# format of df input record:
#   $1          $2       $3   $4     $5     $6     $7
# /news   (/dev/dsk/0s4  ):  85064 blocks  27015 i-nodes
df /news |\
while read partition device junk free junk inodes junk
do
    if [ $inodes -lt $PANICLEVEL ]
    then
        echo Cleaning /news: $inodes free ---- `date` >>$LIBD/cleanlog
        $CLEAN >/dev/null
    fi
done
---------------------------------------------------------------------------

-------------------- Script "CLEAN" ---------------------------------------
# Free up the inodes that are lost due to a system V bug

LIBD=/usr/lib/news/bin
SPOOLD=/news/spool
LOCK=/usr/spool/locks/LCK..news
NEWSSLICE=/dev/dsk/0s4

# wait until there are no processes running on the news disk
# and wait for uuxqt to finish.  Then set a lock
# file so nothing else will happen.  This locks out Uuxqt
while {
    if [ ! -r $NEWSSLICE ] ; then exit 1 ; fi
    set `/etc/fuser $NEWSSLICE 2>&1 `
    [ $# -gt 1 -o -f $LOCK ]
}
do
    sleep 15
done
echo >$LOCK

# then run the fsck
$LIBD/fixnewsfs

rm $LOCK
------------------------------------------------------------------------------

---------------------------- Script NIGHTLY ----------------------------------
set -xv
# %Z% %M% %I%
# This script runs nightly.  It does the news expiration.  It also
# runs an fsck on the /news partition to recover lost inodes.
LIBD=/usr/lib/news/bin
SPOOLD=/news/spool
NEWSSLICE=/dev/dsk/0s4
EXPIRE=$LIBD/expire
LOCK=/usr/spool/locks/LCK..news

trap "exit" 0 1 2 3 15

# Before anything begins, wait for uuxqt to finish.  Then set a lock
# file so nothing else will happen.  This locks out Uuxqt
date
while [ -f $LOCK ]
do
    sleep 15
done
echo >$LOCK
date

trap "rm -f $LOCK; exit" 0 1 2 3 15

# Save the old error log files
$LIBD/Savelog >/dev/null 2>&1

# Expire old news
# never,never,never expire RSI groups

# blow away some stuff daily, since of little interest
$EXPIRE -e 1 -n control

# blow these away quickly since they are of time-based merit
$EXPIRE -e 2 -n \
misc,\
!misc.jobs.offered,\
rec,\
!rec.humor,\
!rec.guns,\
!rec.arts.poems,\
!rec.ham-radio,\
!rec.ham-radio.all,\
soc,\
soc.motss,\
talk,\
!talk.bizarre,\
!talk.politics.guns,\
alt,\
!alt.sources

$EXPIRE -e 7 -n \
alt.sources,\
!soc.motss,\
talk.bizarre,\
misc.jobs.offered,\
rec.arts.poems,\
rec.humor,\
rec.ham-radio,\
rec.ham-radio.all

$EXPIRE -e 4 -n \
!misc.jobs.offered,\
rec.guns,\
talk.politics.guns,\
junk

# computer groups will hang around for a bit longer.  xwindows (my baby)
# will hang around for a week.
$EXPIRE -e 2 -n \
sci

$EXPIRE -e 7 -n \
comp,gnu

$EXPIRE -e 3 -n news

# finally, if I forgot any newsgroups, blow them away after 1 week...
$EXPIRE -e 7 -n all,\
!rsi.technical,\
!rsi.std,\
!rsi.std.rfc,\
!rsi.std.pfc

# On the first of the month, run find to clean up the disk of all files
# that were not purged by expire, then rebuild the history files...
if [ `date +%d` -eq 1 ]
then
    $LIBD/Monthly
fi

# When all done, run fsck to recover lost inodes (Sys V bug)
# first wait until there are no processes running on the news disk
#while {
##  if [ ! -r $NEWSSLICE ]
##  then
##      echo *** WARNING: the news partition is not readable!
##      rm -f $LOCK
##      exit 1
##  fi
#   set `/etc/fuser $NEWSSLICE 2>&1 `
#   [ $# -gt 1 ]
#} do
#   sleep 15
#done

# then run the fsck
$LIBD/fixnewsfs

# Now release the disk to Uuxqt
rm -f $LOCK

$LIBD/Useage | $LIBD/recnews general
-------------------------------------------------------------------------------

--------------------------- script Monthly ------------------------------------
# Things to be done by news monthly
LIBD=/usr/lib/news/bin
SPOOLD=/news/spool
cd $LIBD
EXPIRE=$LIBD/expire

# Find all junk files in the news spool directory and blow them away.
# When done, rebuild the history files.
find $SPOOLD \( -size 0 -o -mtime +15 \) -type f -print -exec rm -f '{}' \;
$EXPIRE -r -e 99999 -E 99999
------------------------------------------------------------------------------

------------------------------- script Sendbatch -----------------------------
: '@(#)sendbatch.sh 1.10 9/23/86'
# pda sendbatch: @(#) Sendbatch 1.2
#
# 3/24/89 pda
# added stuff to check [on sysv] for the number of batches that were
# queued for this site.  if the total data is larger than the specified
# amount, then no more will be queued for this site...
# MAXDATA is the variable for this.  can be set using -m...
#
# also put in hooks for max disk space utilization...based on
# results from df...
# /usr    (/dev/dsk/0s3  ):  289872 blocks  42251 i-nodes
#  $1          $2        $3    $4     $5     $6     $7

tmpfile=/tmp/news$$
cflags=
LIM=50000
CMD='/usr/lib/news/bin/batch /news/batch/$rmt $BLIM'
ECHO=
COMP=
C7=
DOIHAVE=
RNEWS=rnews
SUMMER=/usr/lib/news/bin/Sumtosite
MAXDATA=200000
MINFREE=50000
MININODES=10000

for rmt in $*
do
    # Check for enough disk space on spool partition.  if not enough,
    # then exit...
    df /usr >$tmpfile   # changed to /news after system restore 09/25
    read junk junk junk blocks junk inodes junk <$tmpfile
    rm $tmpfile
    if test \( $blocks -lt $MINFREE \) -o \( $inodes -lt $MININODES \)
    then
        echo $0: Out of Inodes or Free Disk.  Not Run
        df /usr
        exit 1
    fi

    case $rmt in
    -[bBC]*)    cflags="$cflags $rmt"; continue;;
    -m*)    MAXDATA=`expr "$rmt" : '-m\(.*\)'`
        continue;;
    -s*)    LIM=`expr "$rmt" : '-s\(.*\)'`
        continue;;
    -c7)    COMP='| /usr/lib/news/bin/compress $cflags'
        C7='| /usr/lib/news/bin/encode'
        ECHO='echo "#! c7unbatch"'
        continue;;
    -c) COMP='| /usr/lib/news/bin/compress $cflags'
        ECHO='echo "#! cunbatch"'
        continue;;
    -o*)    ECHO=`expr "$rmt" : '-o\(.*\)'`
        RNEWS='cunbatch'
        continue;;
    -i*)    DOIHAVE=`expr "$rmt" : '-i\(.*\)'`
        if test -z "$DOIHAVE"
        then
            DOIHAVE=`uuname -l`
        fi
        continue;;
    esac

    if test -n "$COMP"
    then
        BLIM=`expr $LIM \* 2`
    else
        BLIM=$LIM
    fi

    : make sure we have processed all switches before going on...
    if test $? -eq 0
    then
        if test `$SUMMER $rmt` -gt $MAXDATA
        then
            echo Too much data queued for $rmt, not sending a batch...
            exit 1
        fi
    fi

    : make sure $? is zero
    while test $? -eq 0 -a \( -s /news/batch/$rmt -o -s /news/batch/$rmt.work -o \( -n "$DOIHAVE" -a -s /news/batch/$rmt.ihave \) \)
    do
        if test -n "$DOIHAVE" -a -s /news/batch/$rmt.ihave
        then
            mv /news/batch/$rmt.ihave /news/batch/$rmt.$$
            /usr/lib/news/bin/inews -t "cmsg ihave $DOIHAVE" -n to.$rmt.ctl < \
                /news/batch/$rmt.$$
            rm /news/batch/$rmt.$$
        else
            (eval $ECHO; eval $CMD $COMP $C7) |
            if test -s /news/batch/$rmt.cmd
            then
                /news/batch/$rmt.cmd
            else
                uux - -r -z $rmt!$RNEWS
            fi
        fi
    done
done
-------------------------------------------------------------------------------

---------------------------- crontab for News ---------------------------------
# News' crontab
#
# run the nightly expiration of news
#
30 6 * * * /bin/nice -15 /usr/lib/news/bin/Nightly
#
# Make sure there are enough inodes every 10 minutes
#
7,17,27,37,47,57 2-23 * * * /bin/nice /usr/lib/news/bin/Inodes
#
#
# Feed news to other sites
20 * * * * /bin/nice /usr/lib/news/bin/Sendbatch -c stiatl
# ( you need an entry here for each site you talk to )
#
---------------------------------- end ----------------------------------------

--
John De Armond, WD4OQC                    | Manual? ... What manual ?!?
Radiation Systems, Inc. Atlanta, GA       | This is Unix, My son, You
gatech!stiatl!rsiatl!jgd **I am the NRA** | just GOTTA Know!!!
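
A note on the df parsing used in "inodes" and Sendbatch: both scripts pick
out the free i-node count by field position, which works only while df prints
the device as "(/dev/dsk/0s4 ):" with a space before the closing parenthesis.
The following is a minimal sketch, not part of the scripts above, that finds
the count by looking for the "i-nodes" label instead.  The function names
free_inodes and inodes_low are hypothetical, and the record layout is assumed
to match the samples shown in the script comments.

```shell
# Hypothetical helper (not in the original scripts): print the free
# i-node count from one System V style df record read on standard input.
# Assumes the count is the field immediately before the "i-nodes" label.
free_inodes() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i == "i-nodes") { print $(i - 1); exit }
    }'
}

# Hypothetical helper: succeed (exit 0) when count $1 is below limit $2.
inodes_low() {
    [ "$1" -lt "$2" ]
}

# Possible use in place of the positional "read" (paths as in "inodes"):
#   count=`df /news | free_inodes`
#   if inodes_low "$count" "$PANICLEVEL" ; then $CLEAN >/dev/null ; fi
```

Keying on the label rather than the column means the check should keep
working if a df variant drops or adds the space inside the parentheses, at
the cost of one awk process per run.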