Path: utzoo!utgpu!water!watmath!clyde!ima!bbn!bbn.com!rsalz
From: rsalz@bbn.com (Rich Salz)
Newsgroups: news.admin
Subject: Re: Arbitron "Users" count is very wrong under BSD
Message-ID: <308@fig.bbn.com>
Date: 21 Jan 88 23:30:39 GMT
References: <15538@onfcanim.UUCP> <1978@s.cc.purdue.edu> <13348@oliveb.olivetti.com>
Organization: BBN Laboratories, Cambridge MA
Lines: 96

-In article rsk@s.cc.purdue.edu.UUCP (Rock Wombat) writes:
-...is there a general way to answer the question
-"How many active users does this machine have?"

Not really, short of hardcoding a number into your arbitron script; which
you might not want to rule out-of-hand...

In news.admin, jerry@oliveb.UUCP (Jerry Aguirre) writes:
>multiple times. What I really need is a way to merge the data from the
>5 systems and submit a single report with duplicates merged.

Here's a script that basically does it; feed it a bunch of reports.

#! /bin/sh
##  arb-merge.  Read set of arbitron reports and merge them.
##  This needs to be made portable and configurable the way the real arbitron
##  stuff is, but for now... it works.

HOST=`hostname`
DATE=`date | sed -e 's/....\(...\).*19\(..\)/\119\2/'`

cat $@ | awk '\
BEGIN {
    # Are we ignoring the current system (e.g., duplicate)?
    Ignore = 1
    # Total number of users and readers for all systems.
    Users = 0
    NetReaders = 0
    # List of systems whose reports we have processed.
    SysCount = 1
    SysList[0] = "--ERR--"
    # List of newsgroup names and count thereof.
    GroupCount = 1
    GroupName[0] = "--ERR--"
    # Associative array of number of readers, indexed by group name.
    GroupReaders[0] = "--ERR--"
    # Associative array of "seen this newsgroup?", indexed by group name.
    HaveGroup["--ERR--"] = "no"
}
$1 == "Host" {
    # We assume there are not lots of hosts, so no associative array.
    Ignore = 0
    for (i = 1; i < SysCount; i++)
        if (SysList[i] == $2)
            Ignore = 1
    if (Ignore == 0) {
        SysList[SysCount] = $2
        SysCount++
    }
}
$1 == "Users" {
    if (Ignore == 0)
        Users += $2
}
$1 == "NetReaders" {
    if (Ignore == 0)
        NetReaders += $2
}
$1 == "ReportDate" {
    # We could (should?) check for a bad date here, and ignore reports,
    # but that means we might have to back out of bumping the Users and
    # NetReaders somehow -- not worth it.
}
$1 ~ /[0-9]+/ && $2 ~ /comp\.|misc\.|news\.|rec\.|sci\.|soc\.|talk\./ {
    if (Ignore == 0)
        if (HaveGroup[$2] != "yes") {
            HaveGroup[$2] = "yes"
            GroupName[GroupCount] = $2
            GroupCount++
            GroupReaders[$2] = $1
        }
        else
            GroupReaders[$2] += $1
}
END {
    printf "99999 Host\t\tHOST\n"
    printf "99998 Users\t\t%d\n", Users
    printf "99997 NetReaders\t%d\n", NetReaders
    printf "99996 ReportDate\tDATE\n"
    printf "99995 SystemType\tnews-arbitron-2.4\n"
    for (i = 1; i < SysCount; i++)
        printf "99994 OtherHost\t%s\n", SysList[i]
    for (i = 1; i < GroupCount; i++)
        print GroupReaders[GroupName[i]], GroupName[i]
}' \
| sort -nr \
| sed -e "s/HOST/${HOST}/" -e "s/DATE/${DATE}/" -e "s/9999[0-9] //"

We don't use it at BBN yet, but eventually I'll have some daemon on all
major servers run arbitron and mail the results to me or a program, or
have clients just mail me their .newsrc anonymously...
--
For comp.sources.unix stuff, mail to sources@uunet.uu.net.