Xref: utzoo news.sysadmin:3752 news.software.b:7722 comp.unix.aix:5047
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!elroy.jpl.nasa.gov!ncar!gatech!mcnc!rti!mozart!kent
From: kent@manzi.unx.sas.com (Paul Kent)
Newsgroups: news.sysadmin,news.software.b,comp.unix.aix
Subject: Re: IBM RS/6000 unsuitable for news
Message-ID: <1991May10.173200.9135@unx.sas.com>
Date: 10 May 91 17:32:00 GMT
References: <1991May6.181144.23900@zoo.toronto.edu> <1F7k22w164w@halcyon.uucp> <1991May8.191430.6864@nmt.edu> <1991May09.143518.10260@unx.sas.com>
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Organization: SAS Institute Inc.
Lines: 134
Nntp-Posting-Host: manzi.unx.sas.com

hello,

apologies if this already made it out to your site. i noticed the first
post had distribution=sas, which (i hope) would have restricted the
distribution of the article.

In article <1991May8.191430.6864@nmt.edu>, nraoaoc@nmt.edu (NRAO Array Operations Center) writes:
>In article <1F7k22w164w@halcyon.uucp> halcyon!ralphs@seattleu.edu (Ralph Sims) writes:
>>In an earlier post I mentioned that the average MS-DOS filesize for news
>>articles appeared to be ~3K. Using a 4K blocksize would be fairly efficient
>>under that condition.
>
>Not if you have hundreds of tiny articles and a few giant ones which skew the
>average.
>

the discussion of the length of news articles allows me to kill
two birds with one stone... i can contribute a sample distribution
and make a shameless plug for SAS at the same time :-)

~80% of our news files are <2250 bytes.. food for thought, eh?

the 30000 group seems to have a lump, as it really includes all files
in the tail of the distribution.

SAS's interface to unix pipes makes it easy to summarise the unix
statistics in ways that were previously "challenging".

if you limit yourself to (say) 10 punch cards, this SAS job...

-----------------------------
/*-- figure the stats on the lengths of the news files --*/

/*-- mozart is the computer where news lives..         --*/
/*-- i chose to awk the length field on the remote host--*/
/*-- to minimise net traffic.                          --*/
filename lenpip pipe
   "remsh mozart ""find /usr/spool/news -type f | xargs ls -l | awk '{print \$5}'""";

data len;
   infile lenpip;
   input len;
run;

options ps=60 ls=75;
title2 'File Size Distribution for /usr/spool/news';
proc chart;
   hbar len /midpoints = 0 to 30000 by 500;
run;
-----------------------------

gets you this chart...

                           The SAS System
             File Size Distribution for /usr/spool/news              1

   LEN                                           Cum.             Cum.
Midpoint                                 Freq    Freq Percent  Percent
       |
     0 |                                   14      14    0.03     0.03
   500 |***********                      5671    5685   11.76    11.79
  1000 |******************************  14814   20499   30.72    42.51
  1500 |***********************         11347   31846   23.53    66.03
  2000 |************                     6159   38005   12.77    78.80
  2500 |*******                          3392   41397    7.03    85.84
  3000 |****                             1805   43202    3.74    89.58
  3500 |**                               1178   44380    2.44    92.02
  4000 |**                                841   45221    1.74    93.77
  4500 |*                                 618   45839    1.28    95.05
  5000 |*                                 381   46220    0.79    95.84
  5500 |*                                 310   46530    0.64    96.48
  6000 |                                  216   46746    0.45    96.93
  6500 |                                  174   46920    0.36    97.29
  7000 |                                  157   47077    0.33    97.62
  7500 |                                  114   47191    0.24    97.85
  8000 |                                   65   47256    0.13    97.99
  8500 |                                   92   47348    0.19    98.18
  9000 |                                   64   47412    0.13    98.31
  9500 |                                   47   47459    0.10    98.41
 10000 |                                   39   47498    0.08    98.49
 10500 |                                   35   47533    0.07    98.56
 11000 |                                   43   47576    0.09    98.65
 11500 |                                   38   47614    0.08    98.73
 12000 |                                   18   47632    0.04    98.77
 12500 |                                   36   47668    0.07    98.84
 13000 |                                   29   47697    0.06    98.90
 13500 |                                   18   47715    0.04    98.94
 14000 |                                   30   47745    0.06    99.00
 14500 |                                   26   47771    0.05    99.05
 15000 |                                   19   47790    0.04    99.09
 15500 |                                   21   47811    0.04    99.14
 16000 |                                   17   47828    0.04    99.17
 16500 |                                   12   47840    0.02    99.20
 17000 |                                   10   47850    0.02    99.22
 17500 |                                   16   47866    0.03    99.25
 18000 |                                   10   47876    0.02    99.27
 18500 |                                    8   47884    0.02    99.29
 19000 |                                   14   47898    0.03    99.32
 19500 |                                    5   47903    0.01    99.33
 20000 |                                   10   47913    0.02    99.35
 20500 |                                    5   47918    0.01    99.36
 21000 |                                    7   47925    0.01    99.37
 21500 |                                    4   47929    0.01    99.38
 22000 |                                    7   47936    0.01    99.40
 22500 |                                    4   47940    0.01    99.40
 23000 |                                    8   47948    0.02    99.42
 23500 |                                    5   47953    0.01    99.43
 24000 |                                    4   47957    0.01    99.44
 24500 |                                    2   47959    0.00    99.44
 25000 |                                    7   47966    0.01    99.46
 25500 |                                    5   47971    0.01    99.47
 26000 |                                    1   47972    0.00    99.47
 26500 |                                    3   47975    0.01    99.48
 27000 |                                    3   47978    0.01    99.48
 27500 |                                    2   47980    0.00    99.49
 28000 |                                    2   47982    0.00    99.49
 28500 |                                    2   47984    0.00    99.50
 29000 |                                    1   47985    0.00    99.50
 29500 |                                    2   47987    0.00    99.50
 30000 |                                  240   48227    0.50   100.00
        --------+-------+-------+------
              4000    8000    12000

                   Frequency

cheers,
--
Paul Kent (SQL r&d)
" nothing ventured, nothing disclaimed "
kent@unx.sas.com
SAS Institute Inc, SAS Campus Dr, Cary NC 27513-2414.
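
p.s. for sites without SAS, the same binning can be roughed out in
python. this is only a sketch under assumptions: the function names
(file_sizes, midpoint_bin, distribution, print_chart) are made up for
illustration, and it assumes each bin runs from midpoint-250 up to (but
not including) midpoint+250, with everything past the last midpoint
lumped into the 30000 group as in the chart above. PROC CHART may treat
exact boundary values differently.

```python
import os
from collections import Counter


def file_sizes(root):
    """Yield the size in bytes of every regular file under root.

    Roughly what `find root -type f | xargs ls -l | awk '{print $5}'`
    produces; symlinks are skipped, as `-type f` would skip them.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield os.path.getsize(path)


def midpoint_bin(size, step=500, top=30000):
    """Return the nearest midpoint; sizes beyond `top` fall in the top bin.

    Assumption: a size exactly halfway between two midpoints goes to the
    higher one (e.g. 2250 lands in the 2500 bin).
    """
    return min(top, step * ((size + step // 2) // step))


def distribution(sizes, step=500, top=30000):
    """Return rows of (midpoint, freq, cum_freq, percent, cum_percent)."""
    counts = Counter(midpoint_bin(s, step, top) for s in sizes)
    total = sum(counts.values())
    if total == 0:
        return []
    rows = []
    cum = 0
    for mid in range(0, top + step, step):
        freq = counts.get(mid, 0)
        cum += freq
        rows.append((mid, freq, cum, 100.0 * freq / total, 100.0 * cum / total))
    return rows


def print_chart(root, step=500, top=30000):
    """Print one line per midpoint, in the same column order as the chart."""
    for mid, freq, cum, pct, cum_pct in distribution(file_sizes(root), step, top):
        print(f"{mid:6d} {freq:7d} {cum:7d} {pct:7.2f} {cum_pct:8.2f}")
```

calling print_chart("/usr/spool/news") on the news host itself would
print one row per midpoint. unlike the SAS job it walks the spool
locally rather than going through remsh, so it has to run where the
files live.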