Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/17/84 chuqui version 1.7 9/23/84; site nsc.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!houxm!ihnp4!nsc!chuqui
From: chuqui@nsc.UUCP (Chuq Von Rospach)
Newsgroups: net.news.group,net.news
Subject: mailing lists vs. newsgroups: facts (was Vote Fraud and Newsgroups)
Message-ID: <3221@nsc.UUCP>
Date: Sat, 7-Sep-85 02:01:33 EDT
Article-I.D.: nsc.3221
Posted: Sat Sep 7 02:01:33 1985
Date-Received: Sun, 8-Sep-85 04:21:00 EDT
References: <828@burl.UUCP> <3500005@ccvaxa>
Reply-To: chuqui@nsc.UUCP (Chuq Von Rospach)
Followup-To: net.news.group
Organization: Uncle Chuqui's Lemming Farm
Lines: 126
Xref: watmath net.news.group:3682 net.news:3890
Summary:

In article <3500005@ccvaxa> preece@ccvaxa.UUCP writes:
>In what sense does a mailing list do a better job?  (1) It is less
>visible to new readers, since it isn't just there to be browsed on
>every site. (2) The traffic still has to be passed along the route
>to each reader, as mail.  In some cases that will mean MORE net traffic
>than if the notes had been passed as news.

>I wonder how significant that is.

Oh, I do so hate to put a damper on an argument, but let's try using facts
for once and see what happens...

The following formula shows the number of readers needed on a mailing list
for a newsgroup conversion to break even:

    list_readers = (sites_on_net*efficiency)/(increase*average_hops)

The derivation of that formula is at the bottom of this article for those
that want to check my math. The definitions are:

    o sites_on_net -- the number of sites a message in this newsgroup is
      distributed to. I'll use 1950 based on the following: I assume 2200
      sites on the net. I assume 5% of those sites are local networks and
      transfer cost is 'free'. 5% of the sites turn the group off for
      some reason. That leaves you about 1950 sites.
    o increase -- the factor by which readership increases when converted
      to a newsgroup (1 is no increase, 2 is doubled, 4 is quadrupled,
      etc...). For the best case, let's assume readership quadruples; for
      the worst case, it merely doubles.
    o efficiency -- the efficiency advantage of news transport over mail,
      which is shown as (1-%reduction_for_efficiency). No batching saves
      you 0%, batching with no compression is about 35%, and full
      compression is about 55-65%. The variance between worst case and
      best case is an estimate of the number of sites running various
      batching schemes, and worst case could (theoretically) be as low as
      0%, but let's use the range 35% to 65%. Because news feeds tend to
      be shorter distances than a lot of mail feeds, add another 10%.
      Worst case is then (1-.45) or .55 and best is (1-.75) or .25.
    o average_hops -- the number of hops, on average, that a message in a
      mailing list needs to travel from the list to the recipient. Based
      on the two large mailing lists I've run (lan-news last year,
      nuke-winter this year), the average number of hops from my site to
      the person on the list is about 4. Let's use 3 for a best case and
      5 for a worst case.
    o many mailing lists (mail.feminists, for instance) use intermediary
      distribution points to reduce the number of total hops.
      Mail.feminists has something like 200 people on it, but a lot of
      messages are sent out to sites that redistribute them further to
      keep the load down. This feature allows a list to support a lot
      more users before hitting the breakeven point.
    o large mailing lists can be digested, thereby reducing a lot of mail
      overhead by shipping fewer but larger messages, which also puts off
      the breakeven point (this could also be done by a mod.all group)

Best case breakeven then becomes (1950*.25)/(4*3) or 40 people on the
list. Worst case breakeven is (1950*.55)/(2*5) or 107 people on the list.
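
For anyone who wants to rerun the arithmetic with their own estimates,
the breakeven formula can be sketched in a few lines of Python. The
function name breakeven_readers and the constant SITES_ON_NET are made up
for illustration; the input values are the estimates given in this
article, not measured data.

```python
def breakeven_readers(sites_on_net, efficiency, increase, average_hops):
    """Mailing-list size at which converting to a newsgroup breaks even.

    Implements the article's formula:
        list_readers = (sites_on_net*efficiency)/(increase*average_hops)
    """
    return (sites_on_net * efficiency) / (increase * average_hops)


# The article's estimate of sites that actually pay to carry the group.
SITES_ON_NET = 1950

# Best case for the newsgroup: efficient transport, readership quadruples,
# and the mailing list needs only 3 hops per reader on average.
best = breakeven_readers(SITES_ON_NET, efficiency=0.25, increase=4, average_hops=3)

# Worst case for the newsgroup: less efficient transport, readership only
# doubles, and the mailing list needs 5 hops per reader on average.
worst = breakeven_readers(SITES_ON_NET, efficiency=0.55, increase=2, average_hops=5)

print("best case breakeven: ", int(best), "readers")
print("worst case breakeven:", int(worst), "readers")
```

The unrounded results are roughly 40.6 and 107.25; truncating them gives
the 40 and 107 quoted above. Changing any one estimate (say, increase=1
for no audience growth) shows how sensitive the breakeven point is to
that guess.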

In general, it looks like when the number of readers hits somewhere
between 50 and 75 it makes sense to turn the list into either a moderated
group (if content regulation is of interest) or a net.all group (if you
want a free-for-all).

===== Caveats =====

    o Volume tends to be higher on a newsgroup. Also, there tends to be a
      higher amount of garbage because of the loss of moderation. If
      there is a reason to keep the garbage out, a moderator ought to be
      used with a mod.all group or the mailing list ought to be
      maintained.
    o hop_count_cost assumes netwide traffic. Certain sites (ihnp4 and
      other major mail gateways) would see higher traffic because of a
      mailing list; leaf sites would see lower.
    o Many of those numbers are estimates. Your mileage may vary,
      especially the mailing list -> newsgroup audience increase. It may
      actually be as low as 1:1, and as high as infinity -- we have no
      data to work on. average_hops varies with how well connected the
      hub of a mailing list is, but even if it only talks to ihnp4 the
      average path isn't much worse than 5.
    o With the exception of the fudge factor in the news efficiency, the
      increased cost of a long distance hop over a local hop is ignored.

=== breakeven formula generation ===

The hop_count_cost is defined as total_hops/list_readers, that is, the
number of hops spent per reader.

For a mailing list, total_hops can be defined as
(average_hops * list_readers), so the hop_count_cost becomes

    (average_hops * list_readers)/list_readers

or average_hops.

For a newsgroup, total_hops is defined as the number of sites on the net.
The number of newsgroup readers has to be extrapolated from the number of
readers on the mailing list (list_readers*increase), and we throw in a
fudge factor because transfer by batching in news is more efficient than
shipping mail. The formula becomes:

    (sites_on_net*efficiency)/(list_readers*increase)

Setting those two equations equal to each other, we can find the
breakeven point.
The formula is:

    average_hops = (sites_on_net*efficiency)/(list_readers*increase)

which becomes

    list_readers = (sites_on_net*efficiency)/(increase*average_hops)

and you solve for the number of readers that need to be on the list for a
conversion to a newsgroup to break even.

=== final disclaimer ===

Putting together this article, I have finally figured out why so few
people bother with facts while arguing on the net. It took me about 2
hours to put the math together and a lot of thinking (in other words,
work...). It is a lot easier to play with supposition and opinion, and I
guess we get lazy after a while...

chuq
-- 
Chuq Von Rospach   nsc!chuqui@decwrl.ARPA   {decwrl,hplabs,ihnp4}!nsc!chuqui

An uninformed opinion is no opinion at all. If you don't know what you're
talking about, please try to do it quietly.