Path: utzoo!mnetor!uunet!husc6!mailrus!tut.cis.ohio-state.edu!allosaur.cis.ohio-state.edu!bob
From: bob@allosaur.cis.ohio-state.edu (Bob Sutterfield)
Newsgroups: comp.dcom.lans
Subject: Re: NFS vs RFS (actually, vs Sprite and Andrew)
Message-ID: <8428@tut.cis.ohio-state.edu>
Date: 16 Mar 88 15:42:35 GMT
References: <10370@ut-sally.UUCP> <720@uel.uel.co.uk> <1695@uoregon.UUCP> <45660@sun.uucp>
Sender: news@tut.cis.ohio-state.edu
Organization: The Ohio State University Dept of Computer & Information Science
Lines: 65
Keywords: NFS, ANDREW, TOCS, consistency

In article <45660@sun.uucp> nowicki%rose@Sun.COM (Bill Nowicki) writes:
>In article <1695@uoregon.UUCP>, jqj@uoregon.UUCP (JQ Johnson) writes:
>> ...
>> Problems cited with NFS seem to fall into 2 categories:
>> 1/ scalability to large systems, because of excessive network
>> traffic, excessive server cpu loading, or difficulty in
>> administration;
>
>Well of course the authors of systems are going to prefer their own.
>This is called the "Not Invented Here" syndrome. As for scalability,
>I would estimate about 100,000 installed systems running NFS.
>Last I heard, about 500 run Andrew, and perhaps a few dozen run Sprite.

Installed base != scalability. We have encountered scaling problems
in existing implementations of NFS, and we worry about what happens
when we hit scaling problems in the design.

We have about 130 Suns in our department so far (headed for at least
250 by September), and each NFS-mounts nine filesystems from one of
our servers. We only have one server with the horsepower for that: a
Pyramid 98x. All our Sun-3/180s export only one user filesystem to
the rest of the world, besides taking care of their direct clients.

In a steady-state network with the usual occasional client bounce,
all is well. But think about when a few subnets-full of clients - say
36 or so - are taken off the air while their ND servers/IP
gateways-to-the-backbone are dumped to tape.
When the clients are brought back up and those mount requests hit the
central server, it's not pretty. nfsd has the horsepower,
particularly when there are 8 of them running. inetd and portmap
have no problems directing the traffic. But each time rpc.mountd
tries to service a mount request, it has to sort through its in-core
cache, then flush the state to /etc/rmtab. By the time rpc.mountd has
worked down its request queue, the older requests have timed out.
Then those clients retry. rpc.mountd thrashes and chews up CPU time
with abandon for a few days. We have documented these effects in
well-controlled tests, if you're interested.

We have found that killing rpc.mountd, cp'ing /dev/null to
/etc/rmtab, and starting a new rpc.mountd with -d will often get us
rolling again for a while, until the thrashing starts again in a few
minutes. In an hour or two of babysitting, we can get all our
filesystems mounted on all those clients and life goes on.

The problem seems to be the bottleneck of updating /etc/rmtab. We
would remove that dependency (if we had our sources by now), which
would only break showmount(8).

Note that this is neither a philosophical problem with NFS, nor a
Pyramid-specific bug: According to conversations at UNIForum with a
person from Sun's NFS portability group, all known implementations of
NFS have this feature.

I'm satisfied it's a problem with the current implementations of the
protocol, one that can be fixed by sufficient beating on vendors, or
a few minutes in a quiet room with the sources. But what design
misfeatures might we encounter as we scale even bigger, that aren't
so easily solved? Would someone from Berkeley or CMU care to comment
upon any specific scaling problems {\em in the NFS design} that
Sprite or Andrew would solve in a large network of diskless
workstations?
-=-
Bob Sutterfield, Department of Computer and Information Science
The Ohio State University; 2036 Neil Ave. Columbus OH USA 43210-1277
bob@cis.ohio-state.edu or ...!cbosgd!osu-cis!bob
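
For anyone who wants to play with the arithmetic of the mount storm described above, here is a rough back-of-the-envelope sketch in Python. It is not a model of any real rpc.mountd. It assumes (and this is only an assumption) that each mount request costs a fixed amount plus an amount proportional to the number of entries already recorded in /etc/rmtab, and that every client gives up after a single fixed timeout. The function name `first_pass` and all the cost and timeout figures are made up for illustration; only the 36 clients and nine filesystems per client come from the numbers above.

```python
def first_pass(num_requests, base_cost_ms, per_entry_cost_ms, timeout_ms,
               existing_entries=0):
    """Count how many of a burst of simultaneous mount requests are
    answered before the clients time out.

    Hypothetical cost model (an assumption, not measured behaviour):
    serving one request costs base_cost_ms plus per_entry_cost_ms for
    every entry in the table, because the whole table is searched and
    then written back out.
    """
    now_ms = 0
    entries = existing_entries
    in_time = 0
    for _ in range(num_requests):
        entries += 1                      # this mount adds one table entry
        now_ms += base_cost_ms + per_entry_cost_ms * entries
        if now_ms <= timeout_ms:          # reply arrives before the client gives up
            in_time += 1
    return in_time, num_requests - in_time


if __name__ == "__main__":
    burst = 36 * 9  # 36 clients, nine filesystems each
    # All cost figures below are invented for illustration.
    print("empty table:", first_pass(burst, 50, 2, 30_000))
    print("stale table:", first_pass(burst, 50, 2, 30_000, existing_entries=1000))
```

Under these assumed costs, starting from a table that already holds many stale entries leaves far fewer requests answered in time than starting from an empty one, which would be consistent with emptying /etc/rmtab helping only for a while. Every request that misses the timeout comes back as a retry, so the real situation is worse than this single pass suggests.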