Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!purdue!bu-cs!bloom-beacon!eru!luth!sunic!mcsun!ukc!dcl-cs!aber-cs!pcg
From: pcg@aber-cs.UUCP (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: X-terms v. PCs v. Workstations
Summary: The bottleneck is hard thinking...
Message-ID: <1498@aber-cs.UUCP>
Date: 30 Nov 89 14:38:58 GMT
Reply-To: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Organization: Dept of CS, UCW Aberystwyth (Disclaimer: my statements are purely personal)
Lines: 161

In article <1989Nov28.204639.11237@jarvis.csri.toronto.edu>
jonah@db.toronto.edu (Jeffrey Lee) writes:

    quiroz@cs.rochester.edu (Cesar Quiroz) writes:
    > THE PROBLEM IS NOT TECHNOLOGICAL.

The problem is educational and sociological/political, agreed. Too bad
that too many CPU/system architecture guys assume that the system
administrators out there have the same understanding of the issues as
they have, and can be expected to second guess them and carefully tune
their designs, which is plainly preposterous.

I remember reading Stonebraker, reminiscing on Ingres, saying that he
had chosen static over dynamic indexes because careful analysis
indicated they would be slightly more efficient, and then discovering
that Ingres DBAs would not realize they needed to rebuild a static
index after updates had accumulated, even though this was clearly and
insistently documented in the manuals. This made him a firm believer
in foolproof, automatic, self-tuning technologies, even if they are
less effective and imply higher overheads than those that require
handcrafting, because the careful hand tuning is simply all too often
not available. And then not everything can be made foolproof:

    One of the camps has a predominantly decentralized (fileserver
    plus workstation) model. General users are only allowed to login
    to workstations, leaving the file servers to manage NFS requests
    plus one other task (mail, nameservice, news, YP, etc.)
    Performance used to be miserable in the middle of the day with
    constant NFS-server-not-responding messages and *slow* response
    for swapping.

This is because your sysadmin does not understand diskless machines
and their performance implications. A 40 meg SCSI disc can be had for
$450 mail order; attached to a diskless workstation and used for
swapping and /tmp (or /private), it works incredible wonders. Swapping
across the net, or doing /tmp work across the net, is intolerably
silly.

When you edit a file on the classic diskless workstation, you copy it
over block by block from your user filesystem to the workstation, and
then back block by block to the /tmp filesystem, across the net both
ways, typically from/to the same server as well (while one should at
least put every workstation's /tmp and swap on a different server from
the one used for user files and root)...

One of the greatest jokes I have recently seen is this NFS accelerator
that you put in the *server*, that costs $6995 (reviewed in Unix
Review or World, latest issue), which is about the same as the cost of
15 of the 40 meg disks you can add to each of the *workstations* (and
I would certainly not want to put over 15 workstations on a single
server, as ethernets are bad for rooted patterns of communication),
thus cutting down dramatically on network traffic and enjoying reduced
latency and higher bandwidth as well. But of course, sysadmins all
over will rush to buy it, because of course it is *servers* that need
speeding up... Hard thinking unfortunately is more expensive than kit.

*Large* workstation buffer caches (e.g. 25% of memory) also help a
lot; if you have Sun workstations, they come with especially bad
defaults as to the ratio of buffer headers to buffer slots, which is
8 and should be 1-2, a crucial point that Sun has clearly
misconfigured so that customers end up wasting memory.
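To make the header/slot point concrete, here is a toy calculation of
my own (not Sun's code; the 2 meg cache, the 1K average cached block
and the per-header figures are illustrative assumptions, and this is
only one reading of the ratio claim) showing how little of a buffer
cache can actually be filled when each buffer header is backed by 8K
of buffer memory but most cached blocks are small:

/*
 * Toy model, not Sun's code: how the number of buffer headers limits
 * what a BSD-style buffer cache can actually hold.  Each header caches
 * exactly one filesystem block, and with mostly small files the typical
 * cached block is far smaller than the memory reserved per header.
 */
#include <stdio.h>

int main(void)
{
    long cache_bytes  = 2L * 1024 * 1024;  /* assumed: 2 megs of buffer memory  */
    long avg_block    = 1024;              /* assumed: typical cached block, 1K */
    long kb_per_hdr[] = { 8, 2, 1 };       /* buffer memory reserved per header */
    int  i;

    for (i = 0; i < 3; i++) {
        long headers = cache_bytes / (kb_per_hdr[i] * 1024);
        long held    = headers * avg_block;        /* bytes the cache can hold */
        if (held > cache_bytes)
            held = cache_bytes;
        printf("%ldK per header: %5ld headers, %4ldK of %ldK usable (%ld%%)\n",
               kb_per_hdr[i], headers, held / 1024, cache_bytes / 1024,
               100 * held / cache_bytes);
    }
    return 0;
}

With 8K of buffer memory per header only about an eighth of the cache
ever holds data under a small-file workload, which is exactly the kind
of waste complained about above; at 1-2K per header the whole cache is
usable, which is also why the PS below suggests 1-2 megs of cache with
1024-2048 buffer headers.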
    The other camp has workstations and X-terminals. It allows users
    to login to one of two combination file/compute server machines.
    The workstations are mostly used to run an X server with remote
    xterms for login sessions.

A wonderful waste of resources... Do you *really* want a couple of
network X transactions (and context switches, and interrupts, etc...)
for each (actually, thanks to X batching, for every bunch of)
typed/displayed character, e.g. under any screen editor? Why not use
the workstation for local editing and compiles?

    All CPU time is charged at the SAME hourly rate to encourage users
    to run their jobs on the fastest machines.
    The reason: swapping over NFS
                ^^^^^^^^^^^^^^^^^

Which, as I said before, is as inane a thing as you can possibly do...

    puts a higher load on the servers than running the same jobs on
    the central machines. Also, more gain can be had from adding 32MB
    to each server than adding 4MB to each workstation.

Not to mention that multiple workstations waste memory by having
multiple kernels and by not sharing executable images, both of which
have tended to grow ever larger. On the other hand, memory is not the
only bottleneck. Multiple workstations have multiple net interfaces,
disc controllers and CPUs, whose work can overlap in real time.

    Both camps are relatively open on quotas: they are imposed from
    below. You (or your sponsor) are charged for your CPU,
    modem/workstation connect time and disk usage. When your disk
    partition fills up, you can't create any more files. [And everyone
    in your partition will gang up on you if you are at the top of the
    disk usage list.]

Using partitions as a quota mechanism is incredibly gross. Partitions
should be as few as possible, and used only to group files with
similar backup requirements. Charging for resource use is the best
quota mechanism you have, because it is self adjusting (if a resource
is too popular, you can use the fees from its use to expand it). It
also simplifies administration, in that users can waste disc space as
they feel like, as long as they pay for it (the alternative is the
sysadmin educating the users on how to economize on disc consumption,
which only happens if the sysadmin knows more about the issue than the
users).

    Both camps provide centralized backup, system administration, and
    expertise.
    ^^^^^^^^^

As to this I have some doubts... :-(

    Both camps are expanding their facilities. The centralized
    computing camp is planning to add more memory to one of the
    central machines which is starting to get bogged down with
    swapping.

This may well not help. Adding faster swapping discs should be the
answer, and in particular having multiple swapping partitions on
*distinct* and fast controllers. If the working sets of all the active
programs already fit into memory, that is. Otherwise, adding memory
just to keep some inactive working sets core resident, to avoid
swapping them out and in via slow discs and controllers, is less sound
than improving swapping device bandwidth altogether.
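As a rough illustration of why swap device bandwidth matters so much
once things are actually swapping, here is a back-of-envelope sketch;
all the throughput figures are my guesses at late-1980s kit under load
(a shared 10 Mbit/s ethernet, an ordinary SCSI disc, two such discs on
distinct controllers), not measurements:

/*
 * Back-of-envelope only: rough time to page a 2 MB working set back
 * in over various swap devices.  All throughput figures are guesses
 * at late-1980s hardware under load, not measurements.
 */
#include <stdio.h>

int main(void)
{
    double working_set_kb = 2.0 * 1024;    /* assumed working set: 2 megs */
    struct {
        const char *device;
        double kb_per_sec;                 /* assumed sustained throughput */
    } dev[] = {
        { "swap over NFS, shared 10 Mbit/s ethernet",   300.0  },
        { "one local 40 meg SCSI disc",                 1000.0 },
        { "two swap partitions, distinct controllers",  2000.0 },
    };
    int i;

    for (i = 0; i < 3; i++)
        printf("%-44s %5.1f seconds\n",
               dev[i].device, working_set_kb / dev[i].kb_per_sec);
    return 0;
}

The exact figures are not the point, the shape is: the shared network
is the worst possible place to put swap, and spreading swap over
independent controllers scales in a way a single spindle does not.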
    The net result: our experience shows that (our brand of)
    centralized computing is a win over diskless-workstations.

The net result is that people like the many contributors to this
newsgroup work very hard to squeeze performance out of state-of-the-art
technology by devising clever architectures, but the single greatest
performance problem users have out there is that system administration
is a difficult task, and nearly nobody does it properly, whether the
system is centralized or distributed. The typical system administrator
does not understand the performance profiles of machines, much less of
systems or networks, and does not have a clue as to what to do to make
the wonder toys run, except to ask for more kit.

PS: if you want to know what I think is the most effective
configuration, it is probably small workstations (say a 4-8 meg, 2-4
MIPS engine) with a small swap and /tmp disc (say 20-40 megs, 25 msec.)
and large buffer caches (say 1-2 megs, with 1024-2048 buffer headers);
large compute servers (say 8-16-32 megs, 8-16 MIPS), again with a
large swap and /tmp disc (say 80-160 megs, 20 msec.); and small
workstations with very fast disc controllers (say E-SMD/fast SCSI/IPI
boards, 2-4 megs/sec, 2 of them) as file servers holding the user
filesystems (say 4 300-600 meg discs), with the user filesystems split
across the fileservers (say 8-12 workstations per server) and the
shared parts of / and /usr replicated identically on the fileservers
to split the load. Users would do editing and compiles on the local
workstations, and run large applications on the compute servers. I
would also like some X terminals around for users, rather than
developers, or for small time developers.

A centralized solution is second best, as it has less inherent
parallelism and resiliency, but may be somewhat easier to administer.
Setups with many diskless workstations (a misnomer -- so called
diskless workstations really have remote discs) and a single
fileserver, usually with a single large disc, as I have seen so often,
have the worst performance possible and terrible administrative
problems, while setups with many diskful workstations, each with its
own independent set of discs (typically just one) etc... don't have
great performance either, and are as difficult to administer.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk