Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!decvax!decwrl!ucbvax!MITRE-BEDFORD.ARPA!mhg
From: mhg@MITRE-BEDFORD.ARPA.UUCP
Newsgroups: mod.computers.vax
Subject: Re: Request for info on VAX cluster failures.
Message-ID: <8607211412.AA08969@mitre-bedford.ARPA>
Date: Tue, 22-Jul-86 02:47:38 EDT
Article-I.D.: mitre-be.8607211412.AA08969
Posted: Tue Jul 22 02:47:38 1986
Date-Received: Wed, 23-Jul-86 00:39:19 EDT
References: <8607191323.AA08130@mitre-bedford.ARPA>
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The MITRE Corp., Bedford, MA
Lines: 24
Approved: info-vax@sri-kl.arpa


>How often does an entire VAX cluster crash? We're trying to build a system
>with sufficient redundancy to stay up practically all the time (2 VAXes
>identical in hardware and software, HSC50, DEC's disk shadowing), and don't
>know how paranoid to be about the possibility of the whole cluster's
>going out to lunch. Any hints?

In general, clustering significantly reduces the chance of losing the
whole system (unless, of course, your DEC-Man just powers down the HSC
without any advance notice... [It happened to us...]).  If one machine
should happen to crash, it essentially becomes an unreachable node to
the rest of the cluster.  Apart from a loss of power, it would take
quite a bit to bring an entire cluster down.
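
A rough back-of-the-envelope sketch of that reasoning, in Python.  It
assumes node crashes are independent of each other and lumps everything
the nodes share (power, the HSC) into a single term.  The function name
and every probability below are made up for illustration; none of them
are measurements from a real cluster.

```python
def whole_cluster_outage_probability(node_down, shared_down, nodes):
    """Chance that no node is usable at a given moment.

    node_down:   assumed chance that any one node is down (independent)
    shared_down: assumed chance that a shared component (power, HSC) is down
    nodes:       number of nodes in the cluster
    """
    # Every node has to be down at the same time for the nodes alone
    # to take the cluster out.
    all_nodes_down = node_down ** nodes
    # The cluster is out if the shared component is down OR all nodes are.
    return 1.0 - (1.0 - shared_down) * (1.0 - all_nodes_down)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the shape of the result.
    for n in (1, 2, 3):
        p = whole_cluster_outage_probability(
            node_down=0.01, shared_down=0.001, nodes=n
        )
        print(f"{n} node(s): {p:.6f}")
```

Under these assumptions, adding nodes drives the all-nodes-down term
toward zero quickly, so what is left is dominated by the shared pieces.
That matches the experience above: the thing to be paranoid about is
power and the HSC, not a single CPU crashing.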

I know of one installation that went from one 780 to a cluster
consisting of a 780 and two 750s.  Before they clustered, the 780
would crash at least once a week.  Since clustering, they have had
almost no crashes.

Hope this helps.

Mark H. Granoff
ARPA: mhg@mitre-bedford
 DDD: (617) 271-8438