Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!rutgers!princeton!allegra!ulysses!ucbvax!CRNLNS.BITNET!SYSTEM
From: SYSTEM@CRNLNS.BITNET.UUCP
Newsgroups: mod.computers.vax
Subject: re: Load balancing on a cluster
Message-ID: <8702030555.AA16656@ucbvax.Berkeley.EDU>
Date: Mon, 2-Feb-87 11:12:00 EST
Article-I.D.: ucbvax.8702030555.AA16656
Posted: Mon Feb 2 11:12:00 1987
Date-Received: Wed, 4-Feb-87 03:03:11 EST
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The ARPA Internet
Lines: 57
Approved: info-vax@sri-kl.arpa

Geoff,

There is no automatic load-balancing between cluster members. Once a job starts on a particular cpu, it stays on that cpu. All main memory is private. The cluster members are only connected to one another by high speed (70 megabits per second per cable pair) serial communications hardware.

If you want real load balancing with shared memory then you have to buy a "tightly-coupled" multiprocessor. The dual processor systems that DEC currently sells are the VAX 8300 and 8800. (The VAX-11/782 is no longer actively marketed.)

Terminal servers only provide a very coarse level of balancing, in that they can be used to provide a default login to the system that is the least "busy" at that time. DEC's measure of "busyness" is not necessarily one that a user would agree with.

Batch and print "load balancing" is done by the system manager starting a Generic batch queue that everyone submits jobs to, and it will feed jobs to any corresponding cpu specific queue.

For example, the following commands define system specific batch queues on systems LNS61 and LNS62, then start a generic batch queue for people to submit jobs to. Nothing keeps anyone from submitting jobs to system specific queues. If both cpu specific queues are idle at the time a job is submitted to the generic queue, then the job will always start on the queue which has the name that comes first "alphabetically": on 5MIN_A62 in this example.
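
To make that last rule concrete before the queue definitions, here is a rough sketch of it in Python rather than DCL. It is only a toy model of the behaviour described above, not how the VMS job controller is implemented. The function name pick_execution_queue and the "busy" set are made up for illustration, and the handling of the cases where one or both queues are busy is an assumption.

```python
def pick_execution_queue(targets, busy):
    """Toy model: choose where a job submitted to the generic queue starts.

    targets: names of the cpu specific queues fed by the generic queue
    busy:    names of the queues that are currently running a job
    Returns the chosen queue name, or None if no target queue is idle
    (assumption: the job then waits in the generic queue).
    """
    # Queue names are compared in upper case, since DCL upcases
    # unquoted names such as 5MIN_a62.
    busy_upper = {name.upper() for name in busy}
    idle = [name.upper() for name in targets if name.upper() not in busy_upper]
    # Among the idle queues, the name that sorts first wins.
    return min(idle) if idle else None


targets = ["5MIN_a62", "5MIN_b61"]

# Both queues idle: the job starts on 5MIN_A62, as stated above.
print(pick_execution_queue(targets, busy=set()))

# 5MIN_A62 busy: the job goes to the other queue (assumed behaviour).
print(pick_execution_queue(targets, busy={"5MIN_A62"}))

# Both busy: the job stays pending (assumed behaviour).
print(pick_execution_queue(targets, busy={"5MIN_A62", "5MIN_B61"}))
```

The practical consequence is that with two idle machines the queue on LNS62 always gets the first job, so the queue names decide which system is preferred. The actual DCL commands follow.
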
$ write sys$output "Starting 5 minute Batch queues"
$!
$ INITIALIZE/QUEUE/BATCH/ENABLE_GENERIC/ON=LNS62::/START-
        /PROTECTION=(S:E,G:R,O:D,W:RW)/JOB_LIM=1/BASE=4-
        /WSDEFAULT=100/WSQUOTA=500/WSEXTENT=600-
        /CPUDEF=0:05:00/CPUMAX=0:05:00 -
        5MIN_A62
$!
$ INITIALIZE/QUEUE/BATCH/ENABLE_GENERIC/ON=LNS61::/START-
        /PROTECTION=(S:E,G:R,O:D,W:RW)/JOB_LIM=1/BASE=4-
        /WSDEFAULT=100/WSQUOTA=500/WSEXTENT=600-
        /CPUDEF=0:05:00/CPUMAX=0:05:00 -
        5MIN_B61
$!
$ INITIALIZE/QUE/BATCH/GENERIC=(5MIN_a62,5MIN_b61)/START-
        /PROTECTION=(S:E,G:R,O:D,W:RW) -
        5MIN

I hope this helps.

Selden E. Ball, Jr.

Cornell University                  NYNEX: 1-607-255-0688
Laboratory of Nuclear Studies      BITNET: SYSTEM@CRNLNS
Wilson Synchrotron Lab               ARPA: SYSTEM%CRNLNS.BITNET@WISCVM.WISC.EDU
Judd Falls & Dryden Road     PHYSnet/HEPnet/SPAN:
Ithaca, NY, USA  14853              LNS61::SYSTEM = 44283::SYSTEM (node 43.251)