Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!gatech!purdue!decwrl!shelby!unix!hplabs!hpfcdc!hpldola!hp-lsd!col!bdale
From: bdale@col.hp.com (Bdale Garbee)
Newsgroups: comp.sys.hp
Subject: Re: Experience sought with large HP 9000 clusters
Message-ID: <2220005@col.hp.com>
Date: 27 Jun 89 00:13:59 GMT
References: <3517@cps3xx.UUCP>
Organization: HP Colorado Springs Division
Lines: 65

>I would like to hear from anyone who has done this sort of thing
>with this number of machines.

We have several clusters with a lot of machines on them, where "a lot" is
defined as 16 or so.  I'll try to comment a bit.  Please recognize that I
am speaking from personal experience, not as a representative of HP... I write
instrument firmware for a living...

>The cluster servers will be 9000/360's with 12mb of memory and a fast
>and a slow SCSI interface. 

Not bad.  We tend to gravitate towards 350's and 370's as servers because of
a perception that the split bus architecture allows more DMA throughput to
I/O devices than on the 360.  Perhaps someone more authoritative will comment
on whether this is true or not.  We always configure servers with ECC RAM.
It costs more, and parity errors are scarce, but when one does happen on the
server the whole cluster is toast until it reboots.  They are rare enough
that in your environment this may be a don't-care.  Here, it's a
nightmare... emulator setups and such can be costly to reload/restart in terms
of engineering time.  We typically run 8 MB of parity RAM in clients, 16 MB
for MEs and chip designers where the applications are large and hairy.

>The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
>disk on the HPIB interface. The cnodes will all be configured for local swap.

Tasty!  We run a mix of 320/350/360 clients.  The 320's are slow, everything
else seems more than ok.

>Is this a reasonable number of cnodes per cluster? 

It'll work.  Your expectations for disk performance may be much different
from ours, depending largely on the relationship between time spent compiling
and time spent sitting in an editor, or sitting in frame, or something else
that isn't I/O intensive.  We tend to limit ourselves to 16 seats per cluster,
with *nothing* running on the server except Sendmail, etc.  As long as the
load stays below 1 on the server, all seems quite pleasant.  You should
definitely configure your LAN with a bridge per cluster, with each server and
its clients on their own thin strand, and you should be OK.  If not, come back
later and add another server or two, and move clients around.  120 clients on
a single strand is a bad idea.
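
As a rough sketch of the arithmetic behind that advice: with the 16-seat habit
described above and one bridge per cluster, 120 clients split into 8 clusters.
The function name and the default of 16 are illustrative assumptions, not an
HP limit or recommendation.

```python
def clusters_needed(total_clients, seats_per_cluster=16):
    """Return how many clusters (and so bridges) a site of this size needs.

    seats_per_cluster defaults to 16 only because that is the local habit
    mentioned in the post; it is an assumption, not a hard limit.
    """
    if total_clients < 0 or seats_per_cluster <= 0:
        raise ValueError("need total_clients >= 0 and seats_per_cluster > 0")
    # Integer ceiling division: a partly filled cluster still needs a server.
    return -(-total_clients // seats_per_cluster)


if __name__ == "__main__":
    print(clusters_needed(120))
```

The last cluster comes out half full (8 seats), which leaves room to move
clients around later.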

>Has anyone experienced problems running out of process ids in a large cluster?

Not the way you mean.  We typically up the nproc and maxuprc (I think) params
in the client kernels to allow more processes than the default, since we used
to bang heads running X11 and lots of windows.  The defaults may be more
rational now, I don't know.  The global process number space seems to be large
enough, at least for our clusters.  Never had a problem...

>Does anyone have a workaround for the inability to put spooled devices, e.g.,
>printers, on cnodes? 

Sure.  Use a named pipe.  On the client, set up an inittab entry to cat stuff
from the named pipe to the physical device; on the server, tell the spooler to
use the named pipe.  This is explicitly not supported, but local experience is
that it works ok... I forget who suggested this to me originally...  It should
also be possible to un-CDF /usr/lib/lpsched.  Easiest would be to go to the
server and cd to /usr/lib/lpsched+, then move remoteroot out of the way and
link it to localroot, which would allow the scheduler to run on the clients as
well.  Naming all of the printers differently within a cluster should handle
all of the possible conflicts... but I like the named pipe solution better
because you aren't dorking with something an OS update will break, and there's
only one copy of the scheduler to lose sleep over.
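
For what it's worth, the client half of the named pipe trick amounts to a small
relay loop.  Below is a minimal sketch in Python of what the cat in inittab is
doing, not the actual setup: the function names and any paths you pass in are
hypothetical, and it assumes the pipe is visible to both the server's spooler
and the client.  Like cat, the relay returns once the writer closes the pipe,
so something (init respawning the entry, presumably) has to start it again.

```python
import os
import stat


def ensure_fifo(path):
    """Create a named pipe at path unless one is already there."""
    if not os.path.exists(path):
        os.mkfifo(path)
    elif not stat.S_ISFIFO(os.stat(path).st_mode):
        raise ValueError(f"{path} exists but is not a named pipe")


def relay_fifo(fifo_path, device_path, chunk_size=4096):
    """Copy bytes from the pipe to the device until every writer has closed.

    Returns the number of bytes copied.  Opening the pipe for reading blocks
    until a writer (the spooler, in the post's setup) opens the other end.
    """
    total = 0
    with open(fifo_path, "rb") as src, open(device_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty read means all writers have closed
                break
            dst.write(chunk)
            total += len(chunk)
    return total
```

To try it without a printer, point device_path at an ordinary file and write
to the pipe from another process.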

Bdale