Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!ames!sdcsvax!darrell
From: darrell@sdcsvax.UCSD.EDU (Darrell Long)
Newsgroups: comp.os.research
Subject: How do you tell if a remote site is alive?
Message-ID: <3290@sdcsvax.UCSD.EDU>
Date: Tue, 9-Jun-87 02:35:08 EDT
Article-I.D.: sdcsvax.3290
Posted: Tue Jun 9 02:35:08 1987
Date-Received: Thu, 11-Jun-87 06:34:08 EDT
Organization: University of California, San Diego
Lines: 23
Keywords: networks, delay, distributed systems
Approved: mod-os@sdcsvax.uucp

Here's a question for you:

In a distributed system, how do you tell if a remote site is alive or
not? A time-out could be used, but it's not reliable and it's also very
slow. I can think of many approximate solutions, but reliability is
important. When constructing a distributed system, the network is the
slowest component and is the bottleneck.

From what I've read, most folks just assume that there is a way to tell
if a remote site is dead. But this information is very important to many
algorithms. How is it done in practice?

For us university types, a time-out is the usual approximation to a
solution, since we're usually just out to prove a concept and not to
build a product. How do the folks in industry do it? Performance is
critical there, unlike in a university prototype.

DL
-- 
Darrell Long
Department of Computer Science & Engineering, UC San Diego, La Jolla CA 92093
ARPA: Darrell@Beowulf.UCSD.EDU  UUCP: darrell@sdcsvax.uucp
Operating Systems submissions to: mod-os@sdcsvax.uucp