Path: utzoo!attcan!uunet!ginosko!gem.mps.ohio-state.edu!apple!well!xanthian
From: xanthian@well.UUCP (Kent Paul Dolan)
Newsgroups: alt.sources
Subject: Bandwidth Wasters Hall of Fame - The Code
Message-ID: <13742@well.UUCP>
Date: 22 Sep 89 11:32:53 GMT
Reply-To: xanthian@well.UUCP (Kent Paul Dolan)
Distribution: alt
Organization: Whole Earth 'Lectronic Link, Sausalito, CA
Lines: 403

Here's a slightly tongue in cheek bandwidth decreasing tool. Use it in
good health. Please forgive my beta release shar program that adds a
blank line at the end of each file, and then complains about it during
unsharing; it doesn't seem to hurt anything.

well!xanthian
Kent, the man from xanth, now just another echo from The Well.

#! /bin/sh
# This is a shell archive. Remove anything before this line, then feed it
# into a shell via "sh file" or similar. To overwrite existing files,
# type "sh file -c".
# The tool that generated this appeared in the comp.sources.unix newsgroup;
# send mail to comp-sources-unix@uunet.uu.net if you want that tool.
# If this archive is complete, you will see the following message at the end:
#               "End of shell archive."
# Contents: bwhf.hype bwhf.csh bwhf1.awk bwhf2.awk bwhf.example_output
# Wrapped by kent as a guest on Thu Sep 21 20:45:03 1989
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'bwhf.hype' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'bwhf.hype'\"
else
echo shar: Extracting \"'bwhf.hype'\" \(1011 characters\)
sed "s/^X//" >'bwhf.hype' <<'END_OF_FILE'
X                    BANDWIDTH WASTERS HALL OF FAME
X
X              You've seen the postings, now read the code!
X
XHave a group of blowhards taken over your favorite newsgroup, with
Xpostings of negligible content and awesome volume?
X
XIs it getting hard to cut through the chaff in your search to find
Xthose grains of meaning?
X
XAre you mad enough to _take measures_?
X
XDo you wish you had a way to get, not just even, but ahead?
X
XWish no more!
XHere are the tools you need to publish your _very own_
XBandwidth Wasters Hall of Fame articles, and point the finger of
X_public ridicule_ at the guilty parties.
X
XEnclosed are two awk scripts, and a cshell script to run them. These
Xare for a BSD 4.3 system (Sun 4.0.3) with a really wimpy
Ximplementation of awk. You may have to fiddle things a bit to make it
Xgo on your system, but the basics are here.
X
XRead, enjoy, and most of all, use it to _nail the miscreants_!
X
XYours for an improved signal to noise ratio,
X
Xwell!xanthian
XKent, the man from xanth, now just another echo from The Well.
X
END_OF_FILE
echo shar: NEWLINE appended to \"'bwhf.hype'\"
if test 1012 -ne `wc -c <'bwhf.hype'`; then
    echo shar: \"'bwhf.hype'\" unpacked with wrong size!
fi
# end of 'bwhf.hype'
fi
if test -f 'bwhf.csh' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'bwhf.csh'\"
else
echo shar: Extracting \"'bwhf.csh'\" \(1689 characters\)
sed "s/^X//" >'bwhf.csh' <<'END_OF_FILE'
X#!/bin/csh
X#
X# bwhf.csh by Kent Paul Dolan - Public Domain
X#
X# Bandwidth Waster's Hall of Fame master shell script; runs two awk
X# scripts with a sort step between them. Set the execute bit on this
X# file with chmod and put it in your path. It expects the two awk
X# scripts to be in the current directory, and needs access to the
X# "awk" and "sort" and "date" Unix(tm) commands. I don't know whether
X# this command set would work under "sh", I didn't try it.
X#
X# usage: bwhf.csh
X#
X# example: bwhf.csh /usr/spool/news/alt/sources BWHF.alt.sources
X#
X# The first awk script accumulates the statistics for each author in
X# an array, then dumps the array to a temp file for sorting. The
X# [0-9] are to exclude subordinate directories from being processed as
X# articles.
X#
Xawk -f bwhf1.awk ${1}/[0-9]* > /tmp/$$.bwhf.1
X#
X# The sort step sorts on the bytes wasted column, numerically because it
X# has leading blanks, and reversed because we want to list the worst
X# bandwidth wasters first.
X#
Xsort -nr < /tmp/$$.bwhf.1 > /tmp/$$.bwhf.2
X#
X# The second awk script prints a header, including the path to the
X# newsgroup and the date, prints a line for each byte-burner, then
X# prints a footer with a totals line and an apology for not including
X# "beyond AI" capabilities in the output.
X#
Xawk -f bwhf2.awk newsgrouppath=$1 date="`date`" /tmp/$$.bwhf.2 > $2
X#
X# Clean up the temp files - why wait for a reboot?
X#
Xrm /tmp/$$.bwhf.[1-2]
X#
X# You might want to put this back in, to preview the output before you
X# send it off to your favorite newsgroup; it was giving me fits when I
X# ran this script in background, so I commented it out.
X#
X#more $2
X
END_OF_FILE
echo shar: NEWLINE appended to \"'bwhf.csh'\"
if test 1690 -ne `wc -c <'bwhf.csh'`; then
    echo shar: \"'bwhf.csh'\" unpacked with wrong size!
fi
chmod +x 'bwhf.csh'
# end of 'bwhf.csh'
fi
if test -f 'bwhf1.awk' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'bwhf1.awk'\"
else
echo shar: Extracting \"'bwhf1.awk'\" \(4966 characters\)
sed "s/^X//" >'bwhf1.awk' <<'END_OF_FILE'
X#
X# bwhf1.awk by Kent Paul Dolan - Public Domain
X#
X# Bandwidth Waster's Hall of Fame first awk script; finds article
X# authors on "From:" line, credits them with the article and the bytes
X# it contains, accumulates byte and article counts into arrays indexed
X# by author (_love_ those associative array indices) , counts total
X# bytes, lists bytes, byte share, articles, author's login, and any
X# other author info from the "From:" line.
X#
X# Fails to merge postings from the same author at different sites,
X# because it is not possible to distinguish the case of different
X# people at different sites with the same login, and the same person
X# and login from different sites, by mechanical means.
X#
X# This script is normally run by csh script bwhf.csh, but anyway, here is:
X#
X# usage: awk -f bwhf1.awk /[0-9]* >
X#
X# example: awk -f bwhf1.awk /usr/spool/news/alt/sources/[0-9]* > temp1
X#
X# where the [0-9]* takes care of the case of a newsgroup with articles
X# which also has one or more subgroups (whose names won't start with
X# [0-9])
X#
X# Set up a couple of variables for file swapping control and multiple
X# "From:" line detection.
X#
XBEGIN {
X#
X# use this to detect when we have changed files and need to start a
X# new bytecount for a new file and save the old one to the old
X# author's count.
X#
X        lastfile = FILENAME
X#
X# Use this to avoid problems with multiple "From:" lines in the same
X# article (not really needed, since awk zeros all variables at
X# creation, but the code is a lot easier to comprehend with this in
X# here):
X#
X        sawfrom = 0
X      }
X#
X# Although this is the first record processing code physically,
X# logically it is not executed until the top of the second and
X# subsequent articles of the input, therefore the "From:" code below
X# has been executed once before this code. This pattern/action pair
X# has to be up here to make sure that the bytecount and sawfrom fields
X# are cleared before any other processing on second and subsequent
X# articles.
X#
X# When the article for the current record has changed:
X#
X# Accumulate the byte count for the previous article for its author
X# (saved as "from" in the "From:" pattern/action set); then clear the
X# bytecount. Reset the lastfile item to the current file name, and
X# clear sawfrom so that we are again looking for a "From:" line.
X#
Xlastfile != FILENAME { bytes[from] = bytes[from] + bytecount
X                       bytecount = 0
X                       lastfile = FILENAME
X                       sawfrom = 0
X                     }
X#
X# For every record (line) in the file (article), count its bytes (the
X# + 1 takes care of the '\n', which is ignored by "length($0)") into a
X# total byte count for the file.
X#
X{ bytecount = bytecount + length($0) + 1 }
X#
X# One line in the article gets special processing: the _first_ "From:"
X# line. If we haven't set sawfrom to 1 in this article, and this line
X# _starts_ with "From:", then it is the one we want to identify the
X# author of the article. Pull the login@site out of the second field
X# as element "from" (the author ID), use it as an array index of an
X# associative array "articles" to (possibly create with contents zero
X# and) bump the article count for this author. Most authors' posting
X# software includes a vanity ID after the login@site information. Use
X# the index and substr commands to pull that off and store it too,
X# indexed by author in associative array "fromtags" . The authors who
X# use more than one vanity ID from the same site get the usage from
X# the last of their articles. Set sawfrom to 1 (true) to avoid
X# processing a second "From:" line where an article includes some of
X# the header of another article without a protecting lead character.
X#
X/^From:/ && sawfrom == 0 {
X        from = $2
X        articles[from]++
X        ind = index($0,$2) + length($2) + 1
X        fromtags[from] = substr($0,ind)
X        sawfrom = 1
X      }
X#
X# After all the articles have been processed, we need to add the
X# bytecount for the last article to the credit of the last wastrel,
X# because we don't see another line to process through the "lastfile =
X# FILENAME" pattern/action pair above, which does that crediting for
X# all other articles but the last one.
X#
X# Loop through the associative byte count array by author to get a
X# total byte count for all the articles, to use in determining an
X# author's share of the total bandwidth waste.
X# Use that information
X# in a second loop which prints per-author summary information to
X# calculate the share percentage field. For each author, print the
X# bytes wasted, the waste share, the articles exuded, and the author
X# ID and author vanity ID.
X#
X# The resulting file is ready for the sort step.
X#
XEND { bytes[from] = bytes[from] + bytecount
X
X      for (from in articles)
X      {
X        bytestotal = bytestotal + bytes[from]
X      }
X      for (from in articles)
X      {
X
X        printf("%8s %6.2f%% %4s %s %s\n", \
X          bytes[from], \
X          (bytes[from]*100)/bytestotal, \
X          articles[from], \
X          from, \
X          fromtags[from])
X      }
X    }
END_OF_FILE
echo shar: NEWLINE appended to \"'bwhf1.awk'\"
if test 4967 -ne `wc -c <'bwhf1.awk'`; then
    echo shar: \"'bwhf1.awk'\" unpacked with wrong size!
fi
# end of 'bwhf1.awk'
fi
if test -f 'bwhf2.awk' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'bwhf2.awk'\"
else
echo shar: Extracting \"'bwhf2.awk'\" \(3187 characters\)
sed "s/^X//" >'bwhf2.awk' <<'END_OF_FILE'
X#
X# bwhf2.awk by Kent Paul Dolan - Public Domain
X#
X# Bandwidth Waster's Hall of Fame second awk script; prints header,
X# prints by-author lines and (re)accumulates byte and article totals,
X# prints a footer showing the totals of bytes, share, and article
X# counts.
X#
X# A "sort" step to sort the by-author lines in reverse bytes-wasted
X# order should be run after the first script and before this one to
X# rank the bandwidth wasters from most to least heinous, although this
X# script is not dependent on the sort order of the input lines.
X#
X# This awk script is normally run by csh script bwhf.csh, but here is:
X#
X# usage: awk -f bwhf2.awk newsgrouppath=/usr/spool/news/whatever \
X#        date="some-string"
X#
X#example: awk -f bwhf2.awk newsgrouppath=/usr/spool/news/alt/sources \
X#        date="`date`" temp2
X#
X# (the "\" means each of these is all supposed to be on one line)
X#
X# Start the header:
X#
XBEGIN {
X        printf("%55s\n", "BANDWIDTH WASTERS HALL OF FAME")
X        printf("%48s\n","for articles in")
X      }
X#
X# Finish the header:
X#
X# This has to be done at the first line, because until awk tries to
X# read the first line, it hasn't seen the command line settings for
X# newsgrouppath and date, so putting this in the BEGIN block failed.
X#
XNR == 1 { pformat = "%" int(40 + ((length(newsgrouppath) + 1) / 2) ) "s\n"
X          printf(pformat,newsgrouppath)
X          pformat = "%" int(40 + ((length(date) + 1) / 2) ) "s\n"
X          printf(pformat,date)
X          print ""
X          print "   Bytes  Volume Offending"
X          print "  Wasted   Share Articles Guilty Party"
X          print ""
X        }
X#
X# Accumulate the total bytes and total articles, and print each
X# wastrel's contribution line:
X#
X# I was faking the share total to 100 percent, but then I thought a
X# bit more. Now it is calculated, giving BWHF posters the chance to
X# edit the sort output down to just the worst ten or so offenders, and
X# pass just those records through this second awk script. My own
X# experience is that people just hate being omitted from the list, but
X# your mileage may vary, so I changed the code to accommodate.
X#
X        { bytestotal = bytestotal + $1
X#
X# We have to strip off the trailing "%" from $2 to make a number:
X#
X          share = substr($2,1,length($2)-1)
X          sharetotal = sharetotal + share
X          articlestotal = articlestotal + $3
X          print
X        }
X#
X# Print the footer, consisting of a Totals line, an apology that this
X# awk script doesn't do AI name matches for posters who use multiple
X# sites, and a none too subtle piece of author puffery and general
X# purpose mischief making.
X#
XEND {
X      print "-------- ------- ----"
X      printf("%8s %6.2f%% %4s Totals for %d authors\n", \
X        bytestotal,sharetotal,articlestotal,NR)
X      print ""
X      print "(Roundoff fuzz may make total share not equal 100.00%)"
X      print ""
X      print "(Sorry, if you posted from more than one site, you got more"
X      print "than one entry. It's unavoidable; think about it! But even"
X      print "though your subtotals look smaller, we know who you are!)"
X      print ""
X      print "[A shar file of the scripts used to create this article was"
X      print "posted to alt.sources by the author, Kent Paul Dolan.]"
X
X    }
END_OF_FILE
echo shar: NEWLINE appended to \"'bwhf2.awk'\"
if test 3188 -ne `wc -c <'bwhf2.awk'`; then
    echo shar: \"'bwhf2.awk'\" unpacked with wrong size!
fi
# end of 'bwhf2.awk'
fi
if test -f 'bwhf.example_output' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'bwhf.example_output'\"
else
echo shar: Extracting \"'bwhf.example_output'\" \(1159 characters\)
sed "s/^X//" >'bwhf.example_output' <<'END_OF_FILE'
X                         BANDWIDTH WASTERS HALL OF FAME
X                                 for articles in
X                           /usr/spool/news/alt/sources
X                          Thu Sep 21 19:54:16 PDT 1989
X
X   Bytes  Volume Offending
X  Wasted   Share Articles Guilty Party
X
X  730081  44.23%   18 pokey@well.UUCP (Jef Poskanzer)
X  294802  17.86%    6 mark@unix386.Convergent.COM (Mark Nudelman)
X  195626  11.85%    5 lwall@jato.Jpl.Nasa.Gov (Larry Wall)
X  149560   9.06%    1 raivio@procyon.hut.FI (Perttu Raivio)
X
X[30 lines of example output omitted to save bandwidth!]
X
X     496   0.03%    1 larrym@rigel.uucp (24121-E R Inghrim(3786)556)
X     477   0.03%    1 garyc@quasi.tek.com (Gary Combs;685-2072;60-720;;tekecs)
X-------- ------- ----
X 1650582  99.98%   67 Totals for 36 authors
X
X(Roundoff fuzz may make total share not equal 100.00%)
X
X(Sorry, if you posted from more than one site, you got more
Xthan one entry. It's unavoidable; think about it! But even
Xthough your subtotals look smaller, we know who you are!)
X
X[A shar file of the scripts used to create this article was
Xposted to alt.sources by the author, Kent Paul Dolan.]
END_OF_FILE
echo shar: NEWLINE appended to \"'bwhf.example_output'\"
if test 1160 -ne `wc -c <'bwhf.example_output'`; then
    echo shar: \"'bwhf.example_output'\" unpacked with wrong size!
fi
# end of 'bwhf.example_output'
fi
echo shar: End of shell archive.
exit 0
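
The comments in bwhf.csh leave open whether the command set would work under "sh". As a rough sketch only, and not part of the archive above, the per-author tally from bwhf1.awk and the sort step can be folded into one Bourne shell function. The name bwhf_tally is hypothetical, the vanity ID column is dropped for brevity, and the FNR variable assumes a POSIX awk, which the "really wimpy" awk mentioned in bwhf.hype may not have offered. FNR replaces the lastfile bookkeeping of the original; it is a swapped-in technique, not the author's method.

```shell
#!/bin/sh
# Sketch only; bwhf_tally is a hypothetical name, not part of the archive.
# Usage: bwhf_tally article-file...
bwhf_tally() {
    awk '
        # First line of each article: credit the previous article to its
        # author, then reset the per-article state.
        FNR == 1 {
            if (NR > 1) bytes[from] += bytecount
            bytecount = 0
            sawfrom = 0
        }
        # Count every line, plus one for its newline.
        { bytecount += length($0) + 1 }
        # Only the first From: line in an article names the author.
        /^From:/ && !sawfrom {
            from = $2
            articles[from]++
            sawfrom = 1
        }
        END {
            bytes[from] += bytecount    # credit the last article
            for (a in articles) total += bytes[a]
            for (a in articles)
                printf("%8d %6.2f%% %4d %s\n",
                       bytes[a], bytes[a] * 100 / total, articles[a], a)
        }
    ' "$@" | sort -nr
}
```

Called as bwhf_tally /usr/spool/news/alt/sources/[0-9]*, this should print ranked lines in the same leading-column layout that bwhf2.awk reads. As in the original, an article with no "From:" line is credited to the previous article's author.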