Path: utzoo!attcan!uunet!yale!husc6!bu-cs!bzs
From: bzs@bu-cs.BU.EDU (Barry Shein)
Newsgroups: comp.unix.wizards
Subject: Re: Vax 11/780 performance vs Sun 4/280 performance
Message-ID: <23288@bu-cs.BU.EDU>
Date: 13 Jun 88 15:56:30 GMT
References: <22957@bu-cs.BU.EDU> <14968@brl-adm.ARPA> <601@modular.UUCP> <7331@swan.ulowell.edu> <2282@rpp386.UUCP> <6926@cit-vax.Caltech.Edu>
Organization: Boston U. Comp. Sci.
Lines: 78
In-reply-to: mangler@cit-vax.Caltech.Edu's message of 13 Jun 88 08:58:03 GMT

>How do they get that kind of throughput? I refuse to believe that it's
>all hardware. Mainframe disks rotate at 3600 RPM like everybody else's
>and their 3 MB/s transfer rate is only slightly higher than a SuperEagle.
>A 2-MIPS CPU would be inadequate to run a BSD filesystem at those speeds,
>so obviously their software overhead is a lot lower, while at the same
>time wasting no disk time. What is VM doing efficiently that Unix does
>inefficiently?
>
>Don Speck speck@vlsi.caltech.edu {amdahl,ames!elroy}!cit-vax!speck

I think a lot of it *is* hardware. I know the big mainframes better
than the small ones.

I/O devices are attached indirectly thru channel controllers.
Channels have their own paths to/from memory (that's critical:
multiple DMAs simultaneously). Also, channels are intelligent; I
remember people saying the channels for the 370/168 had roughly the
same computing power as the 370/158 (i.e. one model down, sort of like
saying that Sun3/280's use Sun3/180's as disk controllers; actually
the compute power is very similar in that comparison).

Channels execute channel commands directly out of memory, sort of
linked list structs in C lingo, with commands, offsets etc. embedded in
them (this has become more common in the mini market also; the UDA is
similar, tho I don't know if it's quite as general). Channels can also
do things like search disks for particular keys, hi/lo/equal, without
involving the central processor.
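
To make the "linked list structs" picture concrete, here is a minimal
sketch in Python (not real channel code). The names ChannelCommand,
run_channel_program, the two command names and the status strings are
all made up for illustration; real channel command words carry command
codes, addresses, counts and flag bits that this ignores. It only shows
the shape: the CPU builds a chain in memory, starts it, and hears back
once, while the "channel" does the key search and the transfer itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical command names, for illustration only; real channel
# command codes and flag bits are different and much richer.
SEARCH_KEY_EQUAL = "search_key_equal"
READ_DATA = "read_data"


@dataclass
class ChannelCommand:
    op: str                                    # what the channel should do
    arg: bytes = b""                           # key to match, for searches
    chain: Optional["ChannelCommand"] = None   # link to the next command


def run_channel_program(first: Optional[ChannelCommand],
                        track: List[Tuple[bytes, bytes]],
                        memory: bytearray) -> str:
    """Walk the command chain against one track of (key, data) records.

    The caller (the "CPU") only sees the final status string.
    """
    pos = None  # index of the record the search left us positioned on
    cmd = first
    while cmd is not None:
        if cmd.op == SEARCH_KEY_EQUAL:
            pos = None
            for i, (key, _data) in enumerate(track):
                if key == cmd.arg:
                    pos = i
                    break
            if pos is None:
                return "no record found"
        elif cmd.op == READ_DATA:
            if pos is None:
                return "program check"   # read with no preceding search
            memory.extend(track[pos][1])  # "DMA" the record into memory
        else:
            return "program check"       # unknown command
        cmd = cmd.chain
    return "channel end"


if __name__ == "__main__":
    track = [(b"ADAMS", b"rec-1"), (b"BAKER", b"rec-2"), (b"CLARK", b"rec-3")]
    program = ChannelCommand(SEARCH_KEY_EQUAL, b"BAKER",
                             chain=ChannelCommand(READ_DATA))
    memory = bytearray()
    print(run_channel_program(program, track, memory), bytes(memory))
```

The point of the sketch is that the search loop lives in
run_channel_program, standing in for the channel, and not in the code
that built the chain.
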
I don't know how much this key searching is used in the various
filesystems; it's obviously a general data base thing.

The channels themselves aren't all that fast, around 3MB/sec, but 16
of them pumping simultaneously to/from different blocks of memory
(on the order of 48MB/sec aggregate, at best) can certainly make it
feel fast.

I heard IBM recently announced a new addition to the 3381 disk series
(these are multi-GB disks) with 256MB (1/4 GB) of cache in the disk.
Rich or poor, it's better to be rich.

The file systems tend to be much simpler (they avoid indirection at
the lower levels), at least in OS, which I'm sure contributes to the
performance. I/O is very asynchronous from a software perspective, so
starting multiple I/Os and sitting back waiting for completions is a
natural way to program. Note that RMS in VMS tries to mimic this kind
of architecture, but no one ever accused a Vax of having fast I/O.

A lot of what we would consider application code is in the OS I/O
code, known as "access methods", so reading various file formats
(zillions, actually: VSAM, ISAM, BDAM, BSAM...) and I/O disciplines
(VTAM etc.) can be optimized at the "kernel" level (there's also
microcode assist on various machines for various operations). It also
tends to push applications programmers towards "being kind" to the
OS: things like pre-allocation of resources are pretty much enforced,
so a lot of the dynamic resource management is just not done during
execution.

There is little doubt that to get a lot of this speedup on Unix
systems you'd have to give up niceties like tree'd directories,
extending files whenever you feel like, dynamic file opening during
run-time (OS tends to do deadlock avoidance rather than detection or
recovery, so it needs to know what files you plan to use before your
job starts; that explains a *lot* of what JCL is all about,
pre-allocation of resources), etc.
You probably wouldn't like it, it would look just like MVS :-)

You'd also have to give up what we call "terminals" in most cases.
IBM terminals (327x's) on big systems are much more like disks:
half-duplex, fill in a screen locally and then blast entire screens
to/from memory in one block I/O operation, no per-char I/O. Emacs
would die. It helps, especially when you have a lot of terminals. I
read about an IBM transaction system with 15,000 terminals logged in;
I said a lot of terminals.

But don't underestimate raw, frothing, manic hardware.

It's a big trade-off. Large IBM mainframes are to I/O what Crays are
to floating point, but you really have to have the problem to want
the cure; for most folks it's unnecessary, MasterCard etc. excepted.

	-Barry Shein, Boston University