Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!cmcl2!brl-adm!adm!ultra!wayne@ames.arpa
From: wayne@ames.arpa
Newsgroups: comp.unix.wizards
Subject: Open question on NFS, efficiency, etc.
Message-ID: <8730@brl-adm.ARPA>
Date: Mon, 10-Aug-87 14:40:34 EDT
Article-I.D.: brl-adm.8730
Posted: Mon Aug 10 14:40:34 1987
Date-Received: Tue, 11-Aug-87 05:00:31 EDT
Sender: news@brl-adm.ARPA
Lines: 42

While waiting at my diskless Sun workstation for a "cp" to complete, an
obvious question wandered into my otherwise empty head.  When I do
"cp a b" on my workstation under NFS, what is in fact happening is that
disk blocks are being read on the server, forwarded over the Ethernet
to my workstation, processed briefly, sent back across the Ethernet to
the server, and finally written back to the same disk they started
from; a rather tortuous path at best.

So I did a quick set of timings on a 2.3 megabyte file: the local "cp"
gave 0.0u, 4.0s, 1:26 elapsed.  To get timings for running the "cp"
directly on the server (via "rsh") I timed both locally ("time rsh
server cp a b") and remotely ("rsh server time cp a b"); local was
0.0u, 0.3s, 0:21 elapsed and remote was 0.0u, 2.1s, 0:15 elapsed.  I
don't know how to time the server's part of a local "cp", but looking
at "perfmeter" didn't show any significant difference in CPU usage.

So in other words, the time perceived by the workstation user went from
86 seconds down to 21 seconds, with no apparent increase in server
load!

Given this, has anybody taken the obvious step of automating this
process, sort of like having "cp" say "hmm, both these files are on the
same server and the file is big enough to cover startup time; I'll just
have him do the copy for me"?  (I realize that "rcp" already does
something like this for "remote-to-remote" situations, but then NFS is
supposed to make things more transparent, right?
And by the way, "time rcp server:a server:b" gave 0.1u, 0.8s, 0:36
elapsed; not as good as "rsh cp" but still a lot better than "cp"!)

Anyway, I realize this is the tip of a rather large iceberg (the "when
do you move the data to the computation and when do you move the
computation to the data?" question), and I have in fact been chipping
away at that iceberg myself (for example, I already have procedures set
up to do lengthy "find"s on the server itself).  But I was just curious
if anybody has done anything "scientific" about this rather limited
version of the general question?  Or UNscientific, for that matter
(like my proposed hacking of "cp" above).  Sure seems like something
obvious to look at for (e.g.) SunOS.

Well, my "cp" is finally done, so back to work ...

Wayne Hathaway                  ultra!wayne@Ames.ARPA
Ultra Network Technologies
2140 Bering drive               with a domain server:
San Jose, CA 95131                 wayne@Ultra.COM
408-922-0100
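
For illustration only, the "cp" hack proposed above might be sketched
roughly as follows.  This is not anything SunOS or NFS provides.  The
mount table (MOUNTS), the break-even size (MIN_BYTES), and the helper
names locate() and plan_copy() are all hypothetical; only the idea of
handing the copy to the server via "rsh" when both files live on the
same server comes from the post.  The sketch only decides which command
to run; it does not run it.

```python
import posixpath

# Hypothetical mount table: local mount point -> (server, exported path).
# A real tool would have to read this from the system instead.
MOUNTS = {
    "/home": ("server", "/export/home"),
    "/usr/src": ("other", "/export/src"),
}

# Assumed break-even size; the post gives no figure, so this is a guess.
MIN_BYTES = 1 << 20


def locate(path, mounts=MOUNTS):
    """Return (server, path_on_server), or None if path is not on NFS."""
    path = posixpath.normpath(path)
    best = None
    for mount_point in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest matching mount point.
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None
    server, export = mounts[best]
    rel = posixpath.relpath(path, best)
    return server, posixpath.normpath(posixpath.join(export, rel))


def plan_copy(src, dst, size, mounts=MOUNTS, min_bytes=MIN_BYTES):
    """Return the command to run: an rsh to the server, or a plain cp."""
    a = locate(src, mounts)
    b = locate(dst, mounts)
    same_server = a is not None and b is not None and a[0] == b[0]
    if same_server and size >= min_bytes:
        # Both files are on one server and the file is big enough to
        # cover startup time: let the server do the copy.
        return ["rsh", a[0], "cp", a[1], b[1]]
    return ["cp", src, dst]


if __name__ == "__main__":
    # The 2.3 megabyte file from the timings above.
    print(plan_copy("/home/wayne/a", "/home/wayne/b", 2300000))
```

With the assumed table, the 2.3 megabyte case yields an "rsh server cp"
command, while a small file or a copy between two different servers
falls back to plain "cp".  A real version would also have to worry
about permissions and user identity on the server, which this sketch
ignores.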