Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.csd.uwm.edu!cs.utexas.edu!uunet!legato!blyon
From: blyon@legato (Bob Lyon)
Newsgroups: comp.protocols.nfs
Subject: Re: Speeding up PC-NFS
Message-ID: <952@legato.LEGATO.COM>
Date: 7 Sep 89 23:47:25 GMT
Reply-To: blyon@Legato.COM (Bob Lyon)
Organization: Legato Systems, Inc., Palo Alto, CA
Lines: 107

In article 756@east.East.Sun.COM, Geoff Arnold says:
] In article joes@islsun3.mse.lehigh.edu (Joe Sieczkowski) writes:
] ::
] ::I recently moved my PC-NFS server from a fairly loaded 8Meg 3/260 to
] ::an unloaded 32Meg 4/280 (new). I expected a large performance
] ::increase; however over-all performance on a mounted drive only
] ::increased by a couple percent. This test was done on a PS2/50 with
] ::a 3c523 board.
] ::
] ::Is there anything I can do to increase the performance? The 4/280
] ::is intended to be a dedicated server for several applications.
] ::
] ::I would appreciate any help.
] ::
] ::Joe
]
] I assume you're running 3.0.1 with the improved cache code. You are?
] Then the gating factor is the write access to the disk on the
] server. The best help here is the new Legato accelerator board.
] See comp.newprod or Unix Review for info. (Bob: d'you want to
] post anything here about it?)
]

Yes. But first let me apologize for a late response. I hope that
this is not too promotional...

The Legato accelerator board, Prestoserve (tm, patent pending) is a
hardware and software product that installs into any Sun VMEbus system
that runs SunOS 4.0 or later. The VME hardware is mostly
battery-backed, non-volatile, low power SRAM.

How it works:

The Prestoserve software is a UNIX driver that interposes itself
between SunOS and the real device drivers. It intercepts synchronous
write requests and caches them in the non-volatile memory. The
Prestoserve driver uses a modified LRU algorithm to eventually,
asynchronously flush its non-volatile cache to the actual disks.
Because the major functionality is delivered via a driver, it installs
easily into object-only distributions. The driver's interposition
guarantees that Prestoserve works properly with any (and every) disk
and controller installed today.

Why it works:

The NFS protocol is stateless, but NFS servers are *not* stateless.
To behave properly in the event of a crash, an NFS server must commit
state changes to non-volatile storage (normally a disk) before
acknowledging state-change requests like write, create, remove, etc.
One write request turns into two synchronous device writes: one to
put the client's data on the disk, the other to update the file's
inode information. If the file is large, then a third synchronous
device write is required to update the file's indirect block.

By copying what the UNIX kernel believes to be synchronous writes into
NV RAM, Prestoserve gives back the UNIX file system performance that
NFS took away. This is accomplished by:

- Write elimination: In any given interval only a small number of
  files are being modified. Writes of the inodes or indirect blocks
  inevitably "hit" a cached entry in the NV RAM. In practice, this
  eliminates 50% - 70% of all physical disk writes on the NFS server.
  (Although NFS modification operations are less than 15% of the NFS
  operation mix, they cause close to 70% of all IO on the server.)

- IO staging: Writes that eventually go to the actual disks are
  written asynchronously, allowing the disk drivers to schedule those
  writes in a manner that takes advantage of disk arm locations,
  rotational delays, and other pending IO requests.

- Lower perceived latency: Since writes occur at memory speeds, NFS
  clients see write speeds equal to NFS read speeds.

Is it real?

Legato currently runs 10 diskless 3/50's, 4 diskless 3/80's, and two
floppy PC-NFS (Release 3.0) machines off of one "presto-ized" 3/160
(Release 4.0.3) server with a Xylogics 451 controlling two 850
megabyte SMD Sabres and a Ciprico 3500 SCSI host adapter controlling
two 580 megabyte SCSI Wren V's. Performance is great! We believe
that the *CPU* will saturate at around 20 clients.

A PC-NFS-based beta site (I cannot disclose their name) measured a
600% (!) performance increase after installing Prestoserve into their
Sun 4/110 server. 600% is a bit unbelievable. However, we expect
dramatic improvements to PC-NFS networks because PC applications
typically write small, uncached blocks (e.g. 512 bytes). This means
that for the PC-NFS client to write 8 Kbytes (16 requests of 512
bytes, each costing 2 or 3 synchronous device writes), the disk must
perform 32 writes (or 48 writes for a large file) without Prestoserve.
With Prestoserve the same task requires only 2 (or 3) asynchronous
writes to occur.

In conclusion, I am not surprised that Joe Sieczkowski is dismayed
that his upgraded server only delivers 2% more performance. His
upgrade addressed the instruction execution latency (which is not a
big problem on Sun3-class servers); it did nothing to address the IO
latency (measured in many milliseconds) which usually is the NFS
server bottleneck.

More information can be obtained from:

	Legato Systems, Inc.
	260 Sheridan Ave.
	Palo Alto, CA 94306
	415-325-2200

or email to prestoserve-request@Legato.COM, or
sun!legato!prestoserve-request.

Bob Lyon
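
As a rough check on the write counts quoted for the 8 Kbyte PC-NFS
example, here is a toy Python model. It is only a sketch and not the
Prestoserve driver: the function names, the block labels ("data",
"inode", "indirect"), and the 8192-byte file system block size are all
assumptions made for illustration. A plain dict stands in for the NV
RAM, so repeated writes to the same block overwrite one cached entry.

```python
# Toy model of the 8 Kbyte write example. Everything here is hypothetical
# illustration; it does not reflect the real Prestoserve driver.

def sync_writes_without_cache(total_bytes, request_bytes, large_file=False):
    """Count device writes when every NFS write request is committed to disk.

    Each request costs one synchronous write for the data and one for the
    inode, plus one for the indirect block if the file is large.
    """
    requests = total_bytes // request_bytes
    per_request = 3 if large_file else 2
    return requests * per_request


def flushed_writes_with_cache(total_bytes, request_bytes,
                              fs_block_bytes=8192, large_file=False):
    """Count device writes when requests are first absorbed by a cache.

    fs_block_bytes is an assumed file system block size. The dict plays
    the role of the non-volatile cache: writing a block that is already
    cached replaces the entry instead of adding a new one.
    """
    cache = {}
    for offset in range(0, total_bytes, request_bytes):
        cache[("data", offset // fs_block_bytes)] = True
        cache[("inode", 0)] = True
        if large_file:
            cache[("indirect", 0)] = True
    # One asynchronous device write per distinct dirty block at flush time.
    return len(cache)


if __name__ == "__main__":
    print(sync_writes_without_cache(8192, 512))                   # 32
    print(sync_writes_without_cache(8192, 512, large_file=True))  # 48
    print(flushed_writes_with_cache(8192, 512))                   # 2
    print(flushed_writes_with_cache(8192, 512, large_file=True))  # 3
```

The model ignores flush timing and the LRU policy entirely; it only
shows why sixteen small writes to the same blocks collapse into two or
three device writes once they are staged in a cache.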