Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site decwrl.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!gamma!epsilon!zeta!sabre!bellcore!decvax!decwrl!jensen
From: jensen@decwrl.UUCP (Paul Jensen)
Newsgroups: net.unix-wizards,net.unix
Subject: Re: VAXclusters and UN*X
Message-ID: <2483@decwrl.UUCP>
Date: Tue, 4-Jun-85 18:28:58 EDT
Article-I.D.: decwrl.2483
Posted: Tue Jun 4 18:28:58 1985
Date-Received: Fri, 7-Jun-85 02:42:28 EDT
References: <208@uwvax.UUCP>, <6211@umcp-cs.UUCP>
Organization: DEC Western Research Lab, Los Altos, CA
Lines: 74
Xref: watmath net.unix-wizards:13433 net.unix:4743

Following is a very brief tutorial on VAXclusters, and how they relate
to unix*:

A cluster is defined by a set of proprietary protocols for implementing
a loosely-coupled multi-processing system. Two of the key protocols are
System Communication Services (SCS), software which defines and
coordinates members of the cluster; and the Distributed Lock Manager,
which allows locks to be shared between processors. These protocols are
entirely software-based: there are no hardware dependencies in them
except at the lowest levels. Also, the protocols are such that control
is distributed dynamically between members of the cluster; in fact,
there is no such thing as a "cluster controller" (the HSC50 is logically
a peer of the VAX processors).

The HSC50 is a high-speed IO server. It services requests for logical
disk blocks. It does not know anything about file structure: this is
imposed by the VAX processors via the MSCP protocol. The HSC50 performs
various sorts of optimizations (similar to those done by the FFS) and
has a peak transfer rate of nearly 4MB/sec.

The RA-series disks are not dynamically dual-ported. Dual-porting was
implemented in RA disks for the purpose of allowing the disk to be
accessed by a secondary controller in the event the primary fails.
In a cluster, a typical configuration would be a disk dual-ported
between either 2 HSC50s or an HSC50 and a UDA50. Only one path will be
active: in the event of the HSC50 failing, the disk will dynamically
fail over to the alternate path.

DECnet is totally unrelated to clusters. It is possible to run DECnet
over a CI bus (using SCS), but a cluster can run fine without a byte of
DECnet code (it IS extremely useful for system management, however).

Allowing a unix (or any other) system to participate in a cluster would
require implementing at a minimum SCS, the connection manager (software
which decides when to form, change, and dissolve clusters), the
distributed lock manager, and MSCP. This is a large amount of code, much
of it embedded in VMS (and therefore subject to VMS licensing
restrictions), and porting it would be a major undertaking. A major
re-write of the file system would be necessary, and adopting some sort
of standard for file locking would be highly recommended.

All the above work would just give you a distributed file system. If you
wanted distributed job and device queues, you would have to implement
the Distributed Job Controller as well. Given the VMS-ish flavor of this
protocol, this task might be distasteful, not to mention non-standard.

In conclusion, the bottom line shakes out as follows:

   o "cluster" of homogeneous UNIX systems with distributed file system
     only: technically feasible but a lot of work (>> 1 man-year).

   o the above with distributed queues: more work, problems with
     maintaining a standard version of unix

Regards,
--- Paul Jensen
    Digital Equipment Corporation

------------------------------------------------------------------------
Disclaimer: All information in this response is drawn from public
sources. All opinions expressed are solely my own. In particular, I
haven't the faintest idea of the future or current plans of either
Ultrix or VMS engineering.

*unix is a trademark of AT&T.
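
Addendum: for anyone who wants the "only one path active" behaviour of a
dual-ported RA disk spelled out, here is a toy sketch in Python. It is
only an illustration of the idea described above, not how VMS, MSCP, or
the HSC50 firmware actually do it. The names Controller, DualPortedDisk,
and read_block are made up for this example, and a Python list stands in
for the disk's logical blocks.

```python
class Controller:
    """Hypothetical stand-in for an HSC50 or UDA50.

    It serves logical blocks by number and knows nothing about files.
    """

    def __init__(self, name):
        self.name = name
        self.alive = True

    def read_block(self, blocks, lbn):
        if not self.alive:
            raise OSError(f"{self.name} is not responding")
        return blocks[lbn]


class DualPortedDisk:
    """A disk with two ports, of which only one path is active at a time."""

    def __init__(self, blocks, primary, secondary):
        self.blocks = blocks        # assumption: a list of block contents
        self.active = primary       # the single path currently in use
        self.standby = secondary    # idle until the active path fails

    def read(self, lbn):
        try:
            return self.active.read_block(self.blocks, lbn)
        except OSError:
            # The active controller failed: swap paths, then retry once.
            # If the standby is also down, the error reaches the caller.
            self.active, self.standby = self.standby, self.active
            return self.active.read_block(self.blocks, lbn)


hsc = Controller("HSC50")
uda = Controller("UDA50")
disk = DualPortedDisk([b"block 0", b"block 1"], hsc, uda)

first = disk.read(0)    # served through the HSC50
hsc.alive = False       # simulate the primary controller failing
second = disk.read(1)   # served through the UDA50 after failover
```

Note that the standby controller never sees a request until the swap
happens, which is the sense in which the disk is not dynamically
dual-ported.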