Path: utzoo!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!samsung!olivea!mintaka!spdcc!dirtydog!suitti
From: suitti@ima.isc.com (Stephen Uitti)
Newsgroups: comp.protocols.nfs
Subject: Re: Incremental sync()s and using disk idle time
Message-ID: <1991Mar18.173211.28474@ima.isc.com>
Date: 18 Mar 91 17:32:11 GMT
References: <28975@cs.yale.edu> <10773@dog.ee.lbl.gov> <3236@crdos1.crd.ge.COM> <1991Mar12.194704.17859@zoo.toronto.edu> <3253@crdos1.crd.ge.COM>
Sender: usenet@ima.isc.com
Reply-To: suitti@ima.isc.com (Stephen Uitti)
Organization: Interactive Systems, Cambridge, MA 02138-5302
Lines: 96

In article <3253@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>In article <1991Mar12.194704.17859@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>
>| The only sensible place to put smarts is in host software, where it can
>| be changed to match the workload and to keep up with new developments.
>| "Smart" hardware almost always makes policy decisions, which is a mistake.
>| The money spent on "smartening" the hardware is usually better spent
>| on making the main CPU faster so that you can get the same performance
>| with smart *software*... especially since a faster CPU is good for a
>| lot of other things too.

Why not be able to load the "smart controller" with whatever code
you want? At system boot time, or whenever, just send the code
you want to the controller. If the controller CPU and
architecture are reasonably easy to work with, it should be able
to keep up with the OS. For that matter, it could be usable by
more than one OS - and optimized for each. You really have a
programmable I/O processor.

Let's say you put much of the filesystem into such a controller.
For UNIX, the host would still need the FS code for other
controllers & network filesystems (although the smart controller
might talk to the network itself - offloading the CPU further).
Anyway, once done, you could compare performance & see what you'd
gained, if anything. You might find, for example, that I/O
throughput is lower, but more main CPU bandwidth is available.
Some sites may be so I/O bandwidth limited that they would be
better off with a less smart controller. No problem - just
download simpler code & tell the OS to treat it as such. You'd
still have a controller with lots of RAM for buffer caching -
possibly allowing the host buffer caches to be reduced.

Maybe what you'd find is that I/O bandwidth is up, but latency is
longer. Maybe what you'd find is that everything is faster.
Maybe you'd find that for the controller to beat the host CPU at
this work, it needs a better CPU than the host has. Who knows?
Maybe you only need a Z80 class CPU with good DMA & lots of RAM.
Of course, if you find you need the same CPU that the host uses,
you may as well make it part of a symmetric multiprocessor -
unless you want to guarantee minimum I/O latency, or something.
You'd certainly have the flexibility to tune the system for
database applications.

What target applications are we talking about? "General
computing" is pretty meaningless. The applications I've seen
are:

	Text editing
	Publishing (requires graphics)
	Software development
	Database processing of various kinds
	CAD/CAM
	Data crunching (mad scientist)

One could identify systems aimed at each of these, and various
combinations. It is quite likely that the smartness of the disk
controller will be more or less relevant for each application.

On a Mac, a draw program can easily become screen-bound. The
disks may be quite slow without being a problem for the user.
Floppies, at 10 KB/sec, may be fast enough (small files, few
accesses) for a single user. A smart disk controller might be a
complete waste of money.

> There's also an issue of reliability.
> [handwaving deleted].

Reliability can be achieved through redundant hardware and
redundant software.
Software can run diagnostics, generate error correction codes,
and perform redundant checks. In hardware, redundant power for
controller memory [and disks] can improve the chances that an
error will not happen. Multiple disks (mirrored disks) can
improve head crash tolerance. One system uses over 32 disks -
each disk takes on one bit of a 32-bit word, plus drives for
error correction codes. Any drive can fail with no loss of data.
Bandwidth is also improved.

Reliability is costly. Users who can live with less of it often
do.

> If I can issue an i/o and tell when it's done, and if the controller
>is configured to insure that data don't sit in the cache for more than
>time X (you define X), then I don't see any problem providing ordered
>writes as needed for data security, and good performance as needed by
>loaded and i/o bound machines. That's what I mean by a smart controller
>and that's what I think is optimal for both performance and cost
>effectiveness.

Let's say you have a controller that has its own buffers, which
are backed up in some way in case of power-fail. Let's say that
the policy is to tell the CPU that the data is on disk as soon as
it is in the buffer. Who cares when the data makes it to disk?
Who cares what order it happens in? The operating system knows
what data order was required, and sends the requests in that
order. The buffers + disk reflect what was desired, and if there
is a main power failure, new requests are not processed
(obviously, power is out). When the system comes back up, the
buffers can be flushed to disk. For that matter, the buffers may
still be valid as cache.
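
The buffer policy described above can be sketched as a toy model.
This is only an illustration under stated assumptions, not any
real controller's firmware: the class name BufferedController,
its method names, and the choice to destage in ascending block
order are all made up here, and plain Python dicts stand in for
the power-fail-safe buffer and the disk.

```python
class BufferedController:
    """Toy model of a controller with a power-fail-safe write buffer.

    Hypothetical names throughout; dicts stand in for real hardware.
    """

    def __init__(self):
        self.disk = {}       # block number -> data actually on the platter
        self.buffer = {}     # pending writes; assumed to survive power loss
        self.powered = True

    def write(self, block, data):
        """Accept a write and acknowledge it as soon as it is buffered."""
        if not self.powered:
            raise RuntimeError("power is out; new requests are not processed")
        self.buffer[block] = data   # a later write to the same block supersedes
        return True                 # the host is told "on disk" at this point

    def read(self, block):
        """Reads check the buffer first, so buffer + disk is the logical state."""
        if block in self.buffer:
            return self.buffer[block]
        return self.disk.get(block)

    def flush_some(self, count):
        """Destage up to `count` buffered blocks, in any convenient order."""
        # Ascending block order is an arbitrary choice for this sketch.
        for block in sorted(self.buffer)[:count]:
            self.disk[block] = self.buffer.pop(block)

    def power_fail(self):
        """Main power is lost; the buffer contents are assumed to be kept."""
        self.powered = False

    def power_restore(self):
        """On restart, everything still buffered is flushed to disk."""
        self.powered = True
        self.flush_some(len(self.buffer))
```

In this model the host issues write() calls in the order it needs,
and read() always returns the most recently acknowledged data
whether or not it has reached the platter. After power_fail() and
power_restore(), the disk holds the latest acknowledged data for
every block written, regardless of the order flush_some() used.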