Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 (USS@Tek, v1.1) based on 4.3bsd-beta 6/6/85; site zeus.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!pyramid!hplabs!tektronix!teklds!zeus!bobr
From: bobr@zeus.UUCP (Robert Reed)
Newsgroups: net.news,net.news.group
Subject: Re: net readership poll (discussion)
Message-ID: <89@zeus.UUCP>
Date: Fri, 28-Mar-86 17:54:13 EST
Article-I.D.: zeus.89
Posted: Fri Mar 28 17:54:13 1986
Date-Received: Tue, 1-Apr-86 05:22:27 EST
References: <5192@glacier.ARPA> <1994@hao.UUCP> <5249@glacier.ARPA>
Reply-To: bobr@zeus.UUCP (Robert Reed)
Organization: CAE Systems Division, Tektronix, Inc., Beaverton, OR.
Lines: 35
Xref: watmath net.news:4723 net.news.group:5313

> Some groups have very low volume, such that it is possible for no articles
> to be current in the group when the survey is run. If that were the case,
> the survey would show no readers when in fact many people may read the
> group.
> David Eppstein, eppstein@cs.columbia.edu, seismo!columbia!cs!eppstein

> What effect, if any, is there from hosts that do not permit access to all
> newsgroups?
> wmartin@brl-smoke.ARPA (Will Martin )

These are both valid concerns if the number of individual samples is small,
but as the sample size increases, both of these anomalies will get lost in
the noise. The major problems in such a survey are:

1. The possibility of error in the collection mechanism. For example, a bug
in the posted arbitron script could introduce intrinsic, random errors into
the data reported by every site. A merely systematic error (e.g.,
consistently reporting half the number of readers in each site report), if
detected, could be accounted for and weighted out of the sample. One
possible source of this kind of error lies in the nature of the responders.
If arbitron is run without root privileges, readers whose home directories
or .newsrc files are protected are counted as users but not as readers.
Similarly, sites which have a set of machines, with accounts for all users
on every machine but a preferred home machine for each partition of the
user set, will show a similar skew in the user/readership ratio. But in
either of these cases the effect is systematic, reducing the percentages
without skewing towards any particular newsgroup.

2. Lack of sample size. Most of the complaints about the readership poll
have been concerns about skew from a set of samples which does not reflect
the interests of the complainant. The easiest fix is to increase the sample
set size.
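
To make the point in item 1 concrete, here is a rough sketch of why a uniform undercount lowers every percentage without favoring any newsgroup. The counts, the user total, and the helper name percent_of_users are invented for illustration; none of them come from an actual arbitron report, and the one-half factor simply mirrors the example given above.

```python
def percent_of_users(readers, users):
    """Percentage of a site's users who read each newsgroup."""
    return {group: 100.0 * n / users for group, n in readers.items()}

# Hypothetical site: 400 accounts, with made-up reader counts per group.
users = 400
true_readers = {"net.news": 40, "net.news.group": 20, "net.jokes": 140}

# Assumption: half of the .newsrc files are unreadable, uniformly across
# groups, so only half of the readers of each group are seen.
seen_readers = {group: n // 2 for group, n in true_readers.items()}

true_pct = percent_of_users(true_readers, users)
seen_pct = percent_of_users(seen_readers, users)

# Every group's percentage shrinks by the same factor, so the ordering
# and the ratios between groups are unchanged.
shrink = {group: seen_pct[group] / true_pct[group] for group in true_pct}

for group in true_pct:
    print(group, true_pct[group], seen_pct[group], shrink[group])
```

If the undercount differed from group to group, the shrink factors would differ as well, and that would be the kind of skew the complaints are worried about.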