Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!hao!ames!ucbcad!ucbvax!cartan!jiff.berkeley.edu!greg
From: greg@jiff.berkeley.edu (Greg)
Newsgroups: news.admin,news.misc
Subject: Statistics by article
Message-ID: <1393@cartan.Berkeley.EDU>
Date: Mon, 9-Nov-87 19:49:47 EST
Article-I.D.: cartan.1393
Posted: Mon Nov 9 19:49:47 1987
Date-Received: Wed, 11-Nov-87 21:36:39 EST
Sender: nobody@cartan.Berkeley.EDU
Reply-To: greg@jiff.berkeley.edu (Greg)
Organization: U. C. Berkeley
Lines: 57
Summary: Addressing technical issues.
Xref: mnetor news.admin:1369 news.misc:1115

Greg (greg@maypo.berkeley.edu) writes:
> rn would
> monitor which articles each reader actually reads, and how long each
> reader spends on those articles.

Mark Brader writes:
>The basic problem is that you can't tell, just because the image
>remains on the screen for a long time, that the reader is really
>reading that article.

A good point.  I propose some reasonable cutoff time, like five minutes,
i.e. if a reader spends two hours on one article, five minutes is averaged
in instead.  There will still be some error from readers walking away from
their terminals, performing shell escapes, and so on.  I figure that the
error will be roughly randomly distributed among articles in proportion to
their popularity anyway; to some extent this variation will merely add a
constant factor to the "true" statistics.  It may also be more appropriate
to report the median reading time rather than the mean.

>In addition, readers may not want such monitoring [because of invasion
>of privacy].

I find it difficult to fret over the fate of a list of numbers about me,
without my name attached, that are promptly averaged in with thousands of
other such numbers.  I think most readers are the same way.  The few who
object are free to turn off the feature.  I'd prefer the voting power to
privacy.  In any case, as with the Arbitron ratings, the Nielsens need not
poll EVERY user on the net.
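
To make the cutoff idea above concrete, here is a minimal sketch in Python.
The function names and the sample reading times are made up for
illustration; only the five-minute cutoff and the mean-versus-median
question come from the proposal itself.

```python
from statistics import mean, median

CUTOFF_SECONDS = 5 * 60  # the five-minute cutoff suggested above


def capped(times, cutoff=CUTOFF_SECONDS):
    """Clamp each reading time so an idle terminal counts as at most `cutoff`."""
    return [min(t, cutoff) for t in times]


def summarize(times, cutoff=CUTOFF_SECONDS):
    """Return reader count, capped mean and capped median for one article."""
    c = capped(times, cutoff)
    return {"readers": len(c), "mean": mean(c), "median": median(c)}


# Hypothetical sample: four readers, one of whom walked away for two hours.
sample = [40, 65, 90, 7200]
stats = summarize(sample)
print(stats)
```

With this sample the two-hour session is counted as 300 seconds, so the
capped mean is 123.75 seconds while the median is 77.5 seconds.  The median
is pulled less by the one idle reader, which is the reason for preferring
it.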
>Besides that, admins might not care for the extra system load...

Chuq Von Rospach also brought up the load issue, in the context of network
load rather than system load:

>If you figure 7,000
>people read usenet once a day (very low numbers! VERY low numbers) and the
>package is 1,000 bytes, the receiving end needs to handle seven megabytes of
>data a day. And these numbers are [ridiculous underestimates]...

>It's a very nice idea. But from a technical point of view, it is a cure much
>worse than the disease.

I don't see why such an expensive scheme is necessary.  In my scheme the
network load for trading statistics is necessarily much smaller than the
load of the news groups themselves.  A news feed in a local network doles
out articles to rn programs running on various hosts; the rn programs would
reply with the stats for their news sessions.  The news feed would then
compress the statistics on its own by taking local averages and sums.
Every week or so the Nielsen program would collect data from all of the
news feed hosts.  The data on all of the articles would be in the same
report; there would be about 20-40 bytes of data per article.  Estimating
that the articles themselves are 1K long on average, the Nielsens would be
only a small fraction of both the global network load and the local system
load.
--
Greg
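
The local aggregation and the size estimate above can be sketched roughly
as follows.  The FeedStats class, its method names and the article ids are
hypothetical; the only figures taken from the post are the 1K average
article size and the 20-40 bytes of statistics per article.

```python
from collections import defaultdict

ARTICLE_BYTES = 1024      # "1K long on average", from the post
STATS_BYTES = (20, 40)    # "20-40 bytes of data per article", from the post


class FeedStats:
    """Hypothetical per-feed accumulator that keeps only sums and counts."""

    def __init__(self):
        self.readers = defaultdict(int)
        self.seconds = defaultdict(int)

    def add_session(self, session):
        """`session` maps article id -> seconds one reader spent on it."""
        for article_id, secs in session.items():
            self.readers[article_id] += 1
            self.seconds[article_id] += secs

    def report(self):
        """One small record per article: (id, readers, mean seconds)."""
        return [(a, n, self.seconds[a] / n)
                for a, n in sorted(self.readers.items())]


def stats_fraction(stats_bytes, article_bytes=ARTICLE_BYTES):
    """Statistics bytes as a fraction of the article bytes they describe."""
    return stats_bytes / article_bytes


feed = FeedStats()
feed.add_session({"<a1@example>": 60, "<b2@example>": 30})  # made-up ids
feed.add_session({"<a1@example>": 120})
rows = feed.report()
low, high = (stats_fraction(b) for b in STATS_BYTES)
print(rows)
print(f"{low:.1%} to {high:.1%} of article traffic")
```

However many rn sessions report in, the feed forwards one record per
article, so the weekly report grows with the number of articles and not
with the number of readers.  At 20-40 bytes against a 1K article, that is
roughly 2% to 4% of the article traffic itself.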