Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!cbosgd!ihnp4!ptsfa!ames!amdcad!decwrl!reid
From: reid@decwrl.UUCP
Newsgroups: news.admin,comp.sources.d
Subject: readership measurement, arbitron, sources, etc.
Message-ID: <107@bacchus.DEC.COM>
Date: Fri, 30-Oct-87 02:54:54 EST
Article-I.D.: bacchus.107
Posted: Fri Oct 30 02:54:54 1987
Date-Received: Sat, 31-Oct-87 17:34:37 EST
Reply-To: reid@decwrl.UUCP (Brian Reid)
Distribution: world
Organization: DEC Western Research
Lines: 55
Xref: utgpu news.admin:1145 comp.sources.d:1370


I'm sorry to have offended some of you (e.g. ecl) and quite unsorry to have
offended others of you. Let me offer some comments.

One of my basic premises is that readership measurement is valuable to the
entire network, and that it is sufficiently valuable that I will keep doing
it even in the face of a certain amount of flamage.

I am constantly striving to get greater statistical accuracy in the
measurements. The main problem with the current set of sites that run
arbitron is that they are pretty much self-selected. This is why a small
sample is not enough. If I could take a genuinely representative
cross-section of the network, then a 1% sample would be plenty, but nobody
knows the demographics of the net and therefore a representative
cross-section is impossible. More is probably better. The way to find out if it is
better is to get more data and see if the numbers change as a result of the
increase. When the percentage of sites running arbitron rose from 5% to 6%,
the readership data changed measurably. That tells me that 6% is not enough.
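
To make that test concrete, here is a minimal sketch of the "add data and see
if the numbers move" check. It is not the arbitron algorithm. The per-site
counts, the net size (TOTAL_SITES), and the function names are all made up for
illustration.

```python
def estimate_readers(sample, total_sites):
    """Scale the mean readers-per-site in the sample up to the whole net."""
    if not sample:
        raise ValueError("sample must contain at least one site")
    return sum(sample) / len(sample) * total_sites


def relative_change(old, new):
    """Fractional shift between two successive estimates."""
    return abs(new - old) / old


# Hypothetical counts of readers of one newsgroup at each reporting site.
sites_at_5_percent = [40, 35, 50, 45, 30]
sites_at_6_percent = sites_at_5_percent + [2]  # one small site starts reporting

TOTAL_SITES = 100  # assumed size of the net, for illustration only

before = estimate_readers(sites_at_5_percent, TOTAL_SITES)
after = estimate_readers(sites_at_6_percent, TOTAL_SITES)
shift = relative_change(before, after)

print(f"estimate at 5%: {before:.0f}")
print(f"estimate at 6%: {after:.0f}")
print(f"relative change: {shift:.1%}")
```

If one extra site moves the estimate this much, the sample has not settled
down yet, which is the same conclusion drawn above from the real data.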

Although my recent posting offended certain people, it also provoked at least
50 new sites into providing the data. That makes it completely worthwhile as
far as I am concerned. I am not trying to get you all to love me or think I'm
wonderful. I'm trying to get more data, and I'll say whatever I have to say
in order to get it. If the price I have to pay to get data from 15% of the
net is that another 15% thinks I'm a total asshole, well, so be it. This is
not a popularity contest.

Some of you think that the readership data is useless or flawed. Fine. Feel
free to think that. One of the reasons that I publish all of the algorithms
and explanations is to allow each of you to form your own opinion of the
worth of the data, rather than having to take my word for it. I continue to
believe that although the measurements are not perfect they are quite a lot
better than nothing, and that, given the current design of the news software,
which makes more accurate measurement impossible, they are about the best that
can be done.

My main goal right now is to get more of the small sites and personal
machines to submit data. This is because I think that almost all of them
think that "my site has only 2 users, so we are statistically insignificant,
so I won't bother doing this". The problem is that when all of you think this
way, then small machines are collectively unrepresented. As a result, the
statistics are skewed towards the behavior of readers on big machines, who
tend in general to be people who don't pay money for their news reading.
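
A rough sketch of how that skew arises, using invented numbers. The site
counts, readers per site, and reading fractions below are assumptions chosen
only to show the effect. They are not measurements.

```python
# Two hypothetical classes of site. None of these figures are real data.
big_sites = {"count": 10, "readers_each": 100, "fraction_reading": 0.10}
small_sites = {"count": 500, "readers_each": 2, "fraction_reading": 0.40}


def total_readers(site_class):
    """All news readers in one class of site."""
    return site_class["count"] * site_class["readers_each"]


def fraction_reading(site_classes):
    """Reader-weighted fraction reading a given group across the classes."""
    readers = sum(total_readers(c) for c in site_classes)
    reading = sum(total_readers(c) * c["fraction_reading"] for c in site_classes)
    return reading / readers


# What the survey sees if every small site decides it is insignificant.
big_only = fraction_reading([big_sites])

# What it would see if the small sites reported too.
everyone = fraction_reading([big_sites, small_sites])

print(f"big sites only: {big_only:.0%}")
print(f"all sites:      {everyone:.0%}")
```

Each small site has only 2 readers, but in this made-up net the small sites
together have as many readers as the big ones, so leaving them out changes
the answer a great deal.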

For 2 years I tried quietly cajoling people to submit the data and about 1
site in 20 did so. Last week I tried annoying people, to see if that would
give any better results, and it did. A 10% increase in the amount of data in
3 days. Clearly being obnoxious is a good strategy for right now. Sooner or
later it will stop working, too, and then I'll figure out some new strategy.

By the way, I figure it's just a matter of time until Bob Webber figures out
how to submit forged and fraudulent data and starts flooding the survey
software with it.

Brian