Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!cbosgd!ihnp4!ptsfa!ames!amdcad!decwrl!reid
From: reid@decwrl.UUCP
Newsgroups: news.admin,comp.sources.d
Subject: readership measurement, arbitron, sources, etc.
Message-ID: <107@bacchus.DEC.COM>
Date: Fri, 30-Oct-87 02:54:54 EST
Article-I.D.: bacchus.107
Posted: Fri Oct 30 02:54:54 1987
Date-Received: Sat, 31-Oct-87 17:34:37 EST
Reply-To: reid@decwrl.UUCP (Brian Reid)
Distribution: world
Organization: DEC Western Research
Lines: 55
Xref: utgpu news.admin:1145 comp.sources.d:1370

I'm sorry to have offended some of you (e.g. ecl) and quite unsorry to have offended others of you. Let me offer some comments.

One of my basic premises is that readership measurement is valuable to the entire network, and that it is sufficiently valuable that I will keep doing it even in the face of a certain amount of flamage.

I am constantly striving to get greater statistical accuracy in the measurements. The main problem with the current set of sites that run arbitron is that they are pretty much self-selected. This is why a small sample is not enough. If I could take a genuinely representative cross-section of the network, then a 1% sample would be plenty, but nobody knows the demographics of the net and therefore a representative cross section is impossible. More is probably better. The way to find out if it is better is to get more data and see if the numbers change as a result of the increase. When the percentage of sites running arbitron rose from 5% to 6%, the readership data changed measurably. That tells me that 6% is not enough.

Although my recent posting offended certain people, it also provoked at least 50 new sites into providing the data. That makes it completely worthwhile as far as I am concerned. I am not trying to get you all to love me or think I'm wonderful. I'm trying to get more data, and I'll say whatever I have to say in order to get it.
If the price I have to pay to get data from 15% of the net is that another 15% thinks I'm a total asshole, well, so be it. This is not a popularity contest.

Some of you think that the readership data is useless or flawed. Fine. Feel free to think that. One of the reasons that I publish all of the algorithms and explanations is to allow each of you to form your own opinion of the worth of the data, rather than having to take my word for it. I continue to believe that although the measurements are not perfect they are quite a lot better than nothing, and given the current design of the news software, which makes more perfect measurement impossible, that it's about the best that can be done.

My main goal right now is to get more of the small sites and personal machines to submit data. This is because I think that almost all of them think that "my site has only 2 users, so we are statistically insignificant, so I won't bother doing this". The problem is that when all of you think this way, then small machines are collectively unrepresented. As a result, the statistics are skewed towards the behavior of readers of big machines, who tend in general to be people who don't pay money for their news reading.

For 2 years I tried quietly cajoling people to submit the data and about 1 site in 20 did so. Last week I tried annoying people, to see if that would give any better results, and it did. A 10% increase in the amount of data in 3 days. Clearly being obnoxious is a good strategy for right now. Sooner or later it will stop working, too, and then I'll figure out some new strategy.

By the way, I figure it's just a matter of time until Bob Webber figures out how to submit forged and fraudulent data and starts flooding the survey software with it.

Brian