Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!sri-spam!ames!sdcsvax!darrell
From: darrell@sdcsvax.UUCP
Newsgroups: comp.os.research
Subject: Re: How do you tell if a remote site is alive?
Message-ID: <3302@sdcsvax.UCSD.EDU>
Date: Thu, 11-Jun-87 21:47:42 EDT
Article-I.D.: sdcsvax.3302
Posted: Thu Jun 11 21:47:42 1987
Date-Received: Sat, 13-Jun-87 11:01:16 EDT
Sender: darrell@sdcsvax.UCSD.EDU
Organization: Hewlett Packard, Colorado Networks Division
Lines: 36
Approved: mod-os@sdcsvax.uucp

> How is it done in practice? For us university-types, time-out is the usual
> approximation to a solution since we're usually just out to prove a concept
> and not to build a product.
>
> How do the folks in industry do it? Performance is critical there, unlike a
> university prototype.

In much (user-level) software, it's still done by timeouts. Even in the
Sun NFS kernel there are layers of timeouts on top of the UDP protocol
(they use UDP because it's fast on a small LAN, and then put timeouts of
1 second on top of it ... strange, huh?).

In one distributed application I read about, hosts periodically send out
sanity checks which carry (host#, state) pairs, either just for themselves
or for all hosts they know about. When a host discovers that another host
is down, it modifies its state information for that host; in the next
broadcast, all other hosts learn of the dead host. When the dead host comes
back up, it tells everyone "I'm alive again". This gets very expensive when
there are a lot of hosts on the network, but it does keep other hosts from
having to time out.

I'm not sure of any other methods; you might want to check what LOCUS
did -- I know they're not a commercial product (or are they?), but it
seemed like a well done system to me.

I wish I knew of more and better methods (than timeouts), but I don't.
I'd be very interested in hearing what you discover in this area.
Thanks, and happy researching (from the land of development!).
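
For what it's worth, the sanity-check scheme described above could be
sketched roughly as follows. This is only an illustration, not the
application I read about: the names (Host, sanity_check, mark_down,
receive, broadcast) are made up, the network is simulated with plain
function calls, and the merge rule (a "down" report wins unless the host
itself is the sender) is my own assumption to keep the example small.

```python
UP, DOWN = "up", "down"


class Host:
    """One participant holding a table of (host, state) pairs."""

    def __init__(self, name):
        self.name = name
        self.view = {name: UP}  # a host always considers itself up

    def sanity_check(self):
        """Return (sender, pairs) for the periodic broadcast."""
        return self.name, dict(self.view)

    def mark_down(self, host):
        """Record that this host found `host` dead (by whatever means)."""
        if host != self.name:
            self.view[host] = DOWN

    def receive(self, sender, report):
        """Merge another host's sanity check into the local table."""
        for host, state in report.items():
            if host == self.name:
                continue                    # nobody else decides our state
            if host == sender:
                self.view[host] = state     # sender speaks for itself
            elif state == DOWN:
                self.view[host] = DOWN      # assumption: bad news wins
            else:
                self.view.setdefault(host, UP)  # learn of unknown hosts


def broadcast(sender, reachable):
    """Simulated broadcast: deliver sender's check to reachable hosts."""
    name, report = sender.sanity_check()
    for host in reachable:
        if host is not sender:
            host.receive(name, report)


if __name__ == "__main__":
    a, b, c = Host("a"), Host("b"), Host("c")
    for h in (a, b, c):
        broadcast(h, [a, b, c])
    a.mark_down("c")          # a notices that c is dead
    broadcast(a, [a, b])      # c is unreachable; b learns from a
    print(b.view)
    broadcast(c, [a, b, c])   # c returns: "I'm alive again"
    print(a.view, b.view)
```

Note that only one host has to notice the failure; the rest learn of it
from the next broadcast. The cost is that every check carries an entry
per known host, which is where the expense on large networks comes from.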
-- jad --
John A Dilley
Hewlett Packard Co.
Colorado Networks Division
Fort Collins, Colorado 80525
ARPA: jad%hpcndm@hplabs.HP.COM
UUCP: {ihnp4,hplabs}!hpfcla!jad