Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!europa.asd.contel.com!sura.net!haven!adm!news
From: pjw@usna.navy.mil, , jw@math30, (Peter J. Welcher (math FACULTY))
Newsgroups: comp.unix.wizards
Subject: load sharing
Message-ID: <25860@adm.brl.mil>
Date: 6 Feb 91 16:22:07 GMT
Sender: news@adm.brl.mil
Lines: 38

I have a question, especially for the academic readers of the group.
(And it may just be that I'm missing the obvious, or re-inventing a wheel.)

The Naval Academy Math Dept has 28 Suns, mostly in faculty offices.
We'd like our students to be able to run Mathematica and Matlab on them
by logging in via PC's running Procomm, connected via Ethernet. We do
that already for a few students, no sweat. I'm worried about handling
lots of students, all trying to do homework the night before it is due.

The question is, is there any easy way to perform load-sharing, other
than by randomly assigning sections or students to hosts?

What I think I'd like to do is tell students to log into a certain host
(say math3) and then have them rlogin-ed to another machine, chosen at
random, before the program (Mathematica, Matlab) is run. Is there any
reason this is a bad idea? My thought is that the rlogin load will be
relatively low, so going thru a common machine won't overload it too
badly. (And it's a SPARC server, 32M memory, with unlimited user license.)

Writing a script that does something with rwho is a possibility, but
there's all the net overhead of rwhod (28 machines). I do want something
that completes within, say, 5 to 10 seconds, so rusers, rup and the like
are no good.

I've written a C program that forks (to get around timeout delays) and
then does rstat calls. It is called "loaddist". It kills processes that
don't finish within a short time, and then prints the name of the least
loaded host (with some other fudge factors thrown into the calculation,
like Sun 3 vs. SPARC).
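For concreteness, the selection step at the end of such a program could look
something like the sketch below. It is only an illustration, not the actual
"loaddist" source: the struct, its field names, and the scoring rule (load
plus one for the incoming user, divided by a relative speed factor) are all
assumptions made up for the example. The rstat calls and the fork/kill
handling are left out; the "answered" flag stands in for "this host replied
before its child was killed".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one host's reply; names are illustrative only. */
struct host_load {
    const char *name;   /* host name, e.g. "math3" */
    double load;        /* load average reported by the host */
    double speed;       /* assumed fudge factor: relative CPU speed */
    int answered;       /* nonzero if the host replied before the timeout */
};

/*
 * Return the host with the lowest weighted score, or NULL if no host
 * answered. The score (load + 1) / speed is an assumed rule: the +1
 * counts the student about to log in, so an idle fast machine beats
 * an idle slow one.
 */
const struct host_load *pick_host(const struct host_load *h, size_t n)
{
    assert(h != NULL || n == 0);

    const struct host_load *best = NULL;
    double best_score = 0.0;

    for (size_t i = 0; i < n; i++) {
        /* Skip hosts that timed out or have a nonsensical speed factor. */
        if (!h[i].answered || h[i].speed <= 0.0)
            continue;

        double score = (h[i].load + 1.0) / h[i].speed;
        if (best == NULL || score < best_score) {
            best = &h[i];
            best_score = score;
        }
    }
    return best;
}
```

The caller would fill one record per host from the replies it collected,
call pick_host(), and print the "name" field of the result (or fall back
to the local host if the result is NULL).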
My idea was to have "rlogin `loaddist`" run for the students when they
log into the specified host, math3. Is this a good or bad idea?

An alternative would be to set "loaddist" up as a daemon, to reduce the
number of forks and the amount of net traffic. The daemon would, say,
fork a query to one host per second, so that all information would be
refreshed every 30 seconds or so. The student script would use signals
to get "loaddist" to emit a hostname.

Any comments or suggestions would be appreciated.
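A rough sketch of the bookkeeping for the daemon variant follows. It assumes
a fixed table of hosts probed round-robin, one per tick, so that 28 hosts at
one probe per second are all refreshed about every 28 seconds. Everything
here (the table layout, the function names, MAX_HOSTS) is hypothetical; the
actual probe (an rstat call in a forked child) and the signal handling that
lets the student script ask for a hostname are not shown.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOSTS 64    /* assumed upper bound on the number of hosts */

/* Hypothetical cache of the most recent load seen for each host. */
struct load_table {
    const char *name[MAX_HOSTS];
    double load[MAX_HOSTS];
    int valid[MAX_HOSTS];   /* nonzero if the last probe succeeded */
    size_t count;           /* number of hosts in the table */
    size_t next;            /* index of the host to probe next */
};

/* Fill the table with host names; no entry is valid until probed. */
void table_init(struct load_table *t, const char *const *names, size_t n)
{
    assert(t != NULL && n <= MAX_HOSTS);
    t->count = n;
    t->next = 0;
    for (size_t i = 0; i < n; i++) {
        t->name[i] = names[i];
        t->load[i] = 0.0;
        t->valid[i] = 0;
    }
}

/* Return the index of the host to probe on this tick and advance the cursor. */
size_t next_host(struct load_table *t)
{
    assert(t != NULL && t->count > 0);
    size_t i = t->next;
    t->next = (i + 1) % t->count;
    return i;
}

/* Store the outcome of one probe; a failed probe marks the entry invalid. */
void record_result(struct load_table *t, size_t i, int ok, double load)
{
    assert(t != NULL && i < t->count);
    t->valid[i] = ok;
    if (ok)
        t->load[i] = load;
}

/* Name of the valid entry with the lowest load, or NULL if none is valid. */
const char *least_loaded(const struct load_table *t)
{
    assert(t != NULL);
    const char *best = NULL;
    double best_load = 0.0;
    for (size_t i = 0; i < t->count; i++) {
        if (!t->valid[i])
            continue;
        if (best == NULL || t->load[i] < best_load) {
            best = t->name[i];
            best_load = t->load[i];
        }
    }
    return best;
}
```

The daemon's main loop would call next_host() once per second, probe that
host, and pass the outcome to record_result(); a request for a hostname
then only reads the table via least_loaded() and causes no net traffic
of its own.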