Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!VAX1.CC.UAKRON.EDU!mcs.kent.edu!usenet.ins.cwru.edu!odin!chet
From: chet@odin.INS.CWRU.Edu (Chet Ramey)
Newsgroups: comp.unix.wizards
Subject: Re: (was slashes, now NFS devices)
Message-ID: <1991Mar11.223836.27934@usenet.ins.cwru.edu>
Date: 11 Mar 91 22:38:36 GMT
References: <124235@uunet.UU.NET> <1991Mar5.005612.29292@usenet.ins.cwru.edu> <1991Mar5.023954.2738@murdoch.acc.Virginia.EDU>
Sender: news@usenet.ins.cwru.edu
Reply-To: chet@po.CWRU.Edu
Organization: Case Western Reserve Univ. Cleveland, Ohio, (USA)
Lines: 203
Nntp-Posting-Host: odin.ins.cwru.edu

I wrote:

>#
>#``Either NFS will have to be changed or NFS will have to be scrapped.''
>#	- John Ousterhout

And Greg Hennessy called me on it:

>Fine. I'll bite. What does Sprite do better than NFS? Vice Versa?

I don't have the time to really do this subject justice, but here are
a few differences between NFS and the Sprite file system, with a little
extra info thrown in at the end.

ENVIRONMENT

The Sprite FS is geared towards server/client local nets, with mostly
diskless clients. All machines are assumed to have lots and lots of
memory (LOTS and LOTS).

NFS works on just about everything.

NAMING SCHEME

Sprite presents a single network-wide Unix file system tree. In other
words, complete transparence. As a consequence, remote devices can be
accessed from any machine without the problems that NFS has.

The Sprite file system hierarchy is composed of separate subtrees called
`domains', each in charge of some portion of the name space. A server
for a domain handles all name lookups for that domain.

The Sprite kernel uses `prefix tables' to map domains to servers. This
mapping is built and maintained dynamically using a special broadcast
protocol.
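
As a rough illustration of how a prefix table could be consulted, here
is a small Python sketch. It is not Sprite kernel code: the names
PREFIX_TABLE, is_prefix, and resolve are hypothetical, the three entries
are the sample ones from this post, and matching on whole path
components is my assumption. The lookup picks the longest matching
prefix and hands the rest of the pathname to that domain's server.

```python
# Hypothetical sketch, not Sprite kernel code.
# Each entry is (prefix, server, token); values are the sample ones from the post.
PREFIX_TABLE = [
    ("/", "S", 42),
    ("/fs", "T", 1),
    ("/fs/1", "U", 23),
]


def is_prefix(prefix, path):
    """True if prefix covers path on a component boundary (an assumption)."""
    if prefix == "/":
        return path.startswith("/")
    return path == prefix or path.startswith(prefix + "/")


def resolve(path, table=PREFIX_TABLE):
    """Return (server, token, remainder) for the longest matching prefix."""
    matches = [entry for entry in table if is_prefix(entry[0], path)]
    if not matches:
        # A real kernel would presumably fall back on the broadcast
        # protocol to locate a server; this sketch simply gives up.
        raise LookupError(f"no prefix matches {path!r}")
    prefix, server, token = max(matches, key=lambda entry: len(entry[0]))
    # The remainder is what would be shipped to the server for resolution.
    remainder = path[len(prefix):].lstrip("/")
    return server, token, remainder
```

Under these assumptions, resolve("/fs/1/usr/chet") selects server U and
leaves "usr/chet" for it to resolve, while "/fs/10/x" falls through to
server T because "/fs/1" does not match on a component boundary. The
client never walks the tree itself; it only needs the prefix tables to
pick a server.
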
These tables look kind of like

	Prefix		Server		Token
	/		S		42
	/fs		T		1
	/fs/1		U		23

When a full pathname is resolved by the Sprite kernel, it matches the
longest prefix from its prefix table and ships the rest of the pathname
off to the corresponding server for resolution.

NFS uses mount/unmount, and requires work to present each machine with
the same view of the network name space.

INTEGRATION WITH VM

The Sprite FS is integrated with the virtual memory system. The VM
system uses ordinary files for backing store for running processes.
The VM and FS share the available memory for their respective caches;
memory is dynamically proportioned between the two. It is not uncommon
to have most of a machine's memory used for the file system cache if it
is not needed for running programs.

I'll let Piercarlo Grandi talk about NFS and the SunOS VM system. ;-)

UNIX SEMANTICS

Sprite provides exact Unix file system semantics. It is explicitly
stateful -- open and close are in the protocol. It has a fancy cache
consistency scheme that makes all this possible.

We all know about NFS.

CACHING

Sprite has an elaborate caching scheme. Both clients and servers cache
file blocks, which are currently 4K in size. Servers also cache file
maps and directories. File blocks are cached using virtual addresses
(file id and block number) rather than absolute disk addresses.

Sprite uses a delayed-write policy similar to that of the Unix file
system. Every 30 seconds, blocks not modified in the previous 30
seconds are written back to the server. A block written to a client's
cache will be written to the server's cache within 30-60 seconds and
will be written from there to disk within another 30-60 seconds.

NFS uses write-through, and flush-on-close. Big performance win for
Sprite here.

Sprite makes much stronger consistency guarantees than NFS.
It guarantees that any client reading data from a file will always see
the latest data, regardless of when and where that data was written,
even in the presence of multiple concurrent writers.

A Sprite server uses callbacks to disable all client caching on a
per-file basis in the case of concurrent multiple writers, which
basically reduces to the NFS write-through scheme. Readers are not
allowed to cache when this happens either, so all read requests go back
to the server. This is stronger than NFS. Sprite servers return a
token from an open protocol request indicating whether or not caching of
that file is allowed, and use the aforementioned callback scheme to
disable caching later.

Sequential write sharing, where several clients open, write, and close
a file in turn, is similarly guaranteed.

NFS does not provide consistency in the case of multiple simultaneous
writers. It provides consistency when sequential write sharing occurs
only if the interval between each open and close is longer than the NFS
consistency-probe period (NFS clients periodically ask the server
whether files they have cached have been modified; the interval between
these probes is the consistency-probe period).

Big win for Sprite in consistency.

FAULT TOLERANCE

NFS wins here, since this is its primary benefit. The one place where
you want a stateless server and write-through.

SCALABILITY

NFS does not really scale well. Sprite scalability is limited by its
use of a broadcast protocol and its use of RPC over a special-purpose
kernel-to-kernel protocol. AFS beats the pants off both.

PERFORMANCE

According to benchmarks published in papers by the Sprite developers,
Sprite is a clear winner performance-wise. Big surprise. The Sprite
developers claim that it is 10-40% faster than NFS, with a server
utilization of about 1/4 that of NFS (a benefit of the caching). They
expect Sprite to be able to handle about 50 clients per server.

EXTRAS

This doesn't really count, since these are things that NFS does not
attempt to solve.
They are neat, though.

Sprite has `pseudo-devices', which are like watchdogs. A pseudo-device
appears in the file system name space and behaves like a regular file or
device, but operations on the file are actually forwarded to a
user-level process which can implement them in any way it chooses.
Sprite uses pseudo-devices to implement its terminal drivers, an X
server, and TCP/IP.

Pseudo-file-systems can transparently extend the Sprite file system to
include foreign file systems, such as NFS. They're like pseudo-devices,
but on a file system level.

MORE INFO

These are some of the papers that explain all of this better. You can
get most of these from sprite-request@sprite.berkeley.edu.

[1] ``The Sprite Network Operating System''
    Ousterhout, Cherenson, Douglis, Nelson, and Welch
    IEEE Computer, 2/88

[2] ``Caching in the Sprite Network File System''
    Nelson, Welch, and Ousterhout
    ACM TOCS v6 #1 (2/88)

[3] ``Prefix Tables: A Simple Mechanism for Locating Files in a
    Distributed System''
    Welch, Ousterhout
    Proceedings of 6th International Conference on Distributed
    Computing Systems, May, 1986

[4] ``Virtual Memory for the Sprite Operating System''
    Nelson
    UCB/CSD report 86/301
    June, 1986

[5] ``Virtual Memory vs. the File System''
    Nelson
    DECWRL report 90/4

[6] ``Spritely NFS''
    Srinivasan and Mogul
    DECWRL report 89/5

[7] ``Pseudo-Devices: User-Level Extensions to the Sprite File System''
    Welch and Ousterhout
    Proceedings of the Summer 1988 Usenix Conference

[8] ``Pseudo-File-Systems''
    Welch and Ousterhout

[9] ``Why Aren't Operating Systems Getting Faster as Fast as Hardware?''
    Ousterhout
    Proceedings of the Summer 1990 Usenix Conference
    (also DECWRL Technical Note TN-11)

Chet

--
Chet Ramey				``Now, somehow we've brought our sins
Network Services Group			  back physically -- and they're
Case Western Reserve University		  pissed.''
chet@ins.CWRU.Edu
		My opinions are just those, and mine alone.