Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!uunet!mcsun!ukc!dcl-cs!aber-cs!pcg
From: pcg@aber-cs.UUCP (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: Single user vs. shared
Message-ID: <1720@aber-cs.UUCP>
Date: 11 Apr 90 21:48:59 GMT
Reply-To: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Organization: Dept of CS, UCW Aberystwyth (Disclaimer: my statements are purely personal)
Lines: 105

In article <1990Apr10.225542.13662@world.std.com> bzs@world.std.com
(Barry Shein) writes:

    From: pcg@aber-cs.UUCP (Piercarlo Grandi)
    >In article <8840010@hpfcso.HP.COM> dgr@hpfcso.HP.COM (Dave Roberts) writes:
    >  In school we had a lab full of Sun 3/50s which were all diskless (via NFS)
    >  to a server. There were about 50 machines on an ethernet which worked
    >
    >Note that 50 machines to a single server is *crazy*. I would not go over a
    >dozen; and even with multiple servers I think that 50+ hosts doing heavy
    >traffic on a single Ethernet requires some careful analysis.

    Gee, Piercarlo, do you ever work from facts forward rather than the
    other way around?

    He said the 50 workstations worked fine except during peak load
    (finals), what else is new? Every utility on earth is set up this
    way. So you say the set-up is crazy? Why? Because it worked? Because
    it offends your intuitive sensibilities?

The setup is crazy because it collapses ungracefully under load. Almost
anything works well if it is used at a fraction of its nominal capacity;
the system engineer is the guy who makes things work even under load.

The problem with a 50 workstation ethernet is that its knee is reached
very quickly as more workstations become significantly active. There are
three possible alternatives that do not guarantee a meltdown:

1) A single large 50-user machine with fast local discs, as it would
   not have wire contention or network overheads.

2) 5 segments, each with 10 diskless workstations and a small server,
   would not have wire contention, because we expect cross-segment
   transactions to be very rare.

3) A wire with 50 diskful workstations would not experience network
   contention, nor network overheads.

It is a damn interesting research problem to find a performance profile
of each of these solutions for various loads, and a cost profile, and
compare them. It is not an interesting research problem to discuss
configurations with built-in narrow bottlenecks.

    I thought compiling hasn't been disk intensive for years, it's CPU
    intensive.

Tell that to Borland! Their compilers are neither... :-). Or maybe :-(.

It depends on how inefficiently and stupidly the compiler is built. Based
on my impressions, I'd say that pcc derived or inspired compilers tend to
be disk traffic intensive, while those with global optimizers tend to be
memory intensive, and thus again usually disc traffic (paging!)
intensive.

If you have infinite memory, either for caching disc blocks or for
avoiding paging, then both types of compilers obviously tend to become
CPU *bound*, rather than intensive. Of course, if you have infinite
resources, any solution will do.

Yet compile times are often fairly "short", and dominated by IO instead,
especially in development environments where you don't optimize but do
generate large symbol tables.

    Does anyone have measurements?

Precious few, for the distributed case. For the local case, from which
some inferences, however haphazard, can be extrapolated, we have more
data: the landmark paper on disc caching by J Smith, and a few others on
the performance characterization of Unix disc access. We also have some
interesting timings for network communications (the CACM one on
efficient RPC on ethernet, even if old, the one on the galloping bits
syndrome, the Amoeba ones, etc...). All these papers are well known, I
assume.

    That doesn't stop you from running with this lead and drawing
    conclusions based on it. To be frank, I don't trust your intuitions.
    I'd rather see some data. Perhaps that's rude.

I'd like to see it as well. I know people are working on that.

On the other hand I think good arguments can be built out of known facts:

1) Ethernet has a well known problem (understatement of the decade) as
   soon as average utilization gets over 30-50%.

2) The total conceivable bandwidth of an Ethernet is just over 1MB/sec,
   but only when just two stations are using it, and only if the
   receiving one can accept full size back to back packets without
   overruns.

3) Each network transaction takes about 3-5 ms on your typical UNIX
   machine (from kernel buffer to kernel buffer); it may take much more,
   depending on various misdesigns, and on whether you are instead
   measuring program to program times.

4) A diskless workstation being actively used generates about
   10-20KB/sec of network traffic, and about 10-20 packets/second.

5) Many Ethernet boards and their interface software cannot sustain
   *input* rates anywhere near the theoretical maximum. In particular
   there is a limit to the number of packets/sec that can be read by
   many machines.

6) On average, if users are doing mostly editing, one user in 10 has an
   active process (but then, why ever give them a workstation each?). If
   they are mostly compiling this ratio worsens substantially.

I will let the interested readers draw their own conclusions based on
back of the envelope arithmetic everybody can do.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
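
The back of the envelope arithmetic invited by the numbered facts above
can be sketched roughly as follows. This is only an illustration under
stated assumptions, not a measurement: it assumes that every packet costs
the server one full 3-5 ms transaction, rounds the wire capacity down to
1000 KB/sec, and uses made-up scenario names and active fractions (the
post only says the ratio "worsens substantially" when compiling).

```python
# Rough load estimate from the numbered facts in the post.
# ASSUMPTIONS (not in the post): one packet == one server transaction,
# wire capacity rounded to 1000 KB/sec, and the active fractions in
# SCENARIOS are illustrative guesses.

WIRE_KB_PER_SEC = 1000.0             # fact 2: "just over 1MB/sec"
TRANSACTION_MS = (3.0, 5.0)          # fact 3: 3-5 ms per transaction
STATION_KB_PER_SEC = (10.0, 20.0)    # fact 4: 10-20 KB/sec when active
STATION_PKTS_PER_SEC = (10.0, 20.0)  # fact 4: 10-20 packets/sec when active


def estimate(stations, active_fraction):
    """Return (low, high) wire utilization and server busy fraction."""
    active = stations * active_fraction
    # Fraction of the wire consumed by the active stations' traffic.
    wire = tuple(active * kb / WIRE_KB_PER_SEC for kb in STATION_KB_PER_SEC)
    # Seconds of server time demanded per second of wall clock time.
    server = tuple(
        active * pkts * ms / 1000.0
        for pkts, ms in zip(STATION_PKTS_PER_SEC, TRANSACTION_MS)
    )
    return {"wire": wire, "server": server}


# Hypothetical scenarios: (label, stations, fraction of stations active).
SCENARIOS = [
    ("50 stations, mostly editing (fact 6)", 50, 0.1),
    ("50 stations, all compiling (guess)", 50, 1.0),
    ("10 stations, all compiling (guess)", 10, 1.0),
]

for label, stations, fraction in SCENARIOS:
    result = estimate(stations, fraction)
    wire_lo, wire_hi = result["wire"]
    srv_lo, srv_hi = result["server"]
    print(f"{label}: wire {wire_lo:.0%}-{wire_hi:.0%}, "
          f"server busy {srv_lo:.0%}-{srv_hi:.0%}")
```

Under these assumptions, 50 stations with one in ten active put the wire
at 5-10% and the server at 15-50% busy; with all 50 active the wire sits
at 50-100%, at or beyond the 30-50% knee of fact 1, and the server is
asked for 1.5 to 5 seconds of work per second. Ten fully active stations
come out at 10-20% of the wire and 30-100% of the server.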