Path: utzoo!mnetor!uunet!husc6!ncar!ames!nrl-cmf!cmcl2!brl-adm!adm!mike@BRL.ARPA
From: mike@BRL.ARPA (Mike Muuss)
Newsgroups: comp.unix.wizards
Subject: UNIX on Cray, COS, etc.
Message-ID: <12452@brl-adm.ARPA>
Date: 17 Mar 88 07:13:04 GMT
Sender: news@brl-adm.ARPA
Lines: 130

William -

In your recent message to UNIX-Wizards, you make some rather bold claims
that I would like to remark on.

>> Another thing I've heard is that UNICOS (Cray's UNIX) is HUGE and SLOW
>> (compared to COS); besides, are you really going to run a program that 
>> takes 5 hours (of CPU time) to run interactively?  We have a Cray X-MP/24
>> running COS at the UT System CHPC (Center for High Impedance Computing)
>> that is backed up for WEEKS on some of the larger job classes.  There are
>> some jobs which *couldn't* be run under UNICOS (on this machine) because
>> UNICOS would take up more memory than COS.

Cray's UNICOS for the XMP is neither "HUGE" nor "SLOW".  At BRL we operate
a Cray X-MP/48 with 3 CPUs of COS and 1 of UNICOS, and a Cray-2, in
addition to an assortment of other machines, and I believe I can offer
you a genuine datapoint. In terms of a performance difference between
COS and UNICOS, there isn't much.  Native UNICOS actually holds a small
but significant advantage in terms of I/O performance (typically around
10% faster file I/O, which rises to 1000% faster file I/O for
small-to-medium transfers to SSD).  This is a pretty neat result,
considering that lots of COS is hand-coded CAL (Cray Assembler), while
the UNICOS kernel remains almost entirely C.

Compute performance is not operating system-specific, but instead
compiler-specific, and Cray provides substantially the same compilers
under both systems.  Run times are typically very close. Differences in
runtimes, when they occur, are typically due to COS and UNICOS being at
slightly differing revision levels in the compilers.

The UNICOS kernel on our XMP is configured for a large load, and uses
176 Kwds total for its resident image, disk, terminal, and network buffers.
The balance of the memory is available for user problems.  Considering
this figure in BYTES, looking *today*, I find these numbers:

	VAX 4.3 kernel:		4.2 Mbytes resident (2.6 Mbytes of buffers)
	Gould UTX2.0:		2.3 Mbytes resident
	XMP UNICOS 2:		1.4 Mbytes resident, with buffers
	Sun-3/50 SUNOS3.4:	0.6 Mbytes resident (no disk drives)

These numbers are determined from kernel printf()s at boot time, or TOP,
and are not estimates.  I would say that the XMP UNICOS kernel stacks up
pretty well against its slower brethren.  I would also like to
mention that COS (both 115 and 116) uses more memory than UNICOS.  I
don't have the figures handy (and rebooting the Cray just to get them
wouldn't make me very popular), but I remember it as being several hundred
Kwds more.  Still not a "HUGE" difference.
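
For anyone checking the arithmetic behind the UNICOS line in the table
above, here is a rough sketch of the conversion.  The X-MP word is 64 bits
(8 bytes); taking "K" as 1024 is an assumption here, and the function name
kwds_to_bytes is made up for this illustration.

```shell
# Convert a size in Cray Kwords to bytes.
# Assumes 8-byte (64-bit) words and K = 1024.
# kwds_to_bytes is a hypothetical helper, not a UNICOS command.
kwds_to_bytes() {
    echo $(( $1 * 1024 * 8 ))
}

kwds_to_bytes 176
```

Under those assumptions 176 Kwds comes to 1441792 bytes, which agrees with
the 1.4 Mbytes shown for XMP UNICOS.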

Having said all that, I don't think that operating system size is enormously
important, as long as it isn't "too big".  One thing that I'm sure we can all
agree on is that XMPs don't have enough main memory, considering their
speed.  It's a lot like the old PDP-11/45 with 256Kbytes of bipolar memory:
very fast (in its day, for the price), but only enough memory for 2 or 3
sizeable compute-bound programs.

I'd also like to note that if your workload is entirely batch, then
there may not be any strong reason to run UNIX on your Cray.  However,
let me tell you that using UNIX on a Cray is pretty heady stuff.  Being
able to open an "XMP" window or a "Cray-2" window on my Sun, and have
the same Shells, screen editors, compilers, source code tools, TCP
networking, etc.etc.etc. as I have on my Suns, SGIs, VAXen, Goulds, and
Alliants is worth a lot to me.  Being able to "RSH" an image processing
command over to a Cray without having to put the files over on the Cray
first, or having to submit some silly batch job, is a really big win.
Consider doing an operation like this in any other environment;  only in
an all VAX/VMS+DECNET software environment do you stand a good chance --
but that isn't multi-vendor (or nearly as fast):

	pixinterp2x -s512  < image.pix |  \
	rsh Cray.arpa "pixfilter -s512 -lo" |  \
	rsh Alliant.arpa "pixmerge -n 63/0/127 -- - background.pix" |  \
	rsh Vax.arpa "pixrot -r -i 1024 1024 | pix-fb -h"

Which roughly says:  grab an image on my local machine, perform
bi-linear interpolation locally, then send it to the Cray for
low-pass filtering, then send it to the Alliant for compositing
with my favorite background, then send it to a trusty VAX to
(a) rotate the image 180 degrees and (b) display it on the frame buffer.
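
The shape of that pipeline can be tried on a single machine.  What follows
is only a sketch: "remote" is a hypothetical stand-in for rsh that ignores
the host name and runs the command in a local child shell, hostA and hostB
are placeholder names, and tr and sed stand in for the image tools.

```shell
# remote HOST COMMAND: hypothetical local stand-in for "rsh HOST COMMAND".
# It ignores HOST and runs COMMAND in a child shell, passing stdin and
# stdout straight through, which is the property the real pipeline relies on.
remote() {
    shift
    sh -c "$1"
}

# Each stage reads the previous stage's stdout; no intermediate file is written.
printf 'hello\n' |
remote hostA "tr a-z A-Z" |
remote hostB "sed 's/^/> /'"
```

This prints "> HELLO".  Replacing the body of remote with a real rsh call
would spread the same stages across machines without changing the pipeline
itself.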

This, by the way, is not an invented shell command.  These programs
really exist, and are commonly used in this way. Note how the image only
"touches" a disk drive once in the whole procedure. Perhaps less
important for this example, because the image at most stages is only 3
Mbytes, but this becomes more important when manipulating 400Mbyte image
data from NASA (which we have occasion to do, in exactly this way).
(By the way, this software is available at no cost; e-mail me for
details.)  Note that these procedures may take significant CPU time, but
you can be certain that I'll be paying careful attention to the screen
as my results arrive.  This >>could<< be done in batch, but then I
wouldn't have the opportunity to type ^C (SIGINT) in the middle if
something is going wrong.  Think of how much Cray time (and people
time!!) that might wind up saving you.

>> besides, are you really going to run a program that 
>> takes 5 hours (of CPU time) to run interactively?

For some programs, the answer is clearly "no".  But perhaps, if your program
were generating graphics describing its progress "on the fly", you might
gain new insight into the problem that you are studying.  If you think
hard enough, almost anything can benefit from graphics.  Even watching
100000x100000 matrices being inverted is likely to improve your
understanding. You might gain a new understanding of the convergence
properties of your algorithm if you could sample every Nth iteration as
a picture on your screen.  Think about it.
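
Sampling every Nth iteration needs very little machinery.  As a minimal
sketch, assuming the long-running program writes one line of state per
iteration to its standard output (the loop below is only a stand-in for
such a program, and sample_every is a hypothetical helper name):

```shell
# sample_every N: pass through every Nth line of stdin.
# Hypothetical helper; awk's NR is the current input line number.
sample_every() {
    awk -v n="$1" 'NR % n == 0'
}

# Stand-in for a solver that prints one line per iteration.
i=1
while [ "$i" -le 10 ]; do
    echo "iteration $i"
    i=$((i + 1))
done | sample_every 5
```

Only iterations 5 and 10 reach the output here; in practice the sampled
lines would be piped on to whatever turns them into a picture.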

In closing, I'd like to summarize by observing that COS isn't a "bad"
system, it just lacks lots of things that have come to be important.
Good interactivity, network access, and portable software are not easy
to do without in the fast-track business of "high-tech".

	Best,
	 -Mike Muuss
	  Ballistic Research Lab

PostScripts:

1)  Yes, I know what UT-2D is.  The result of Texan ingenuity striving
desperately to avoid COS's forefathers.

>> Scholars who study dinosaurs say there were some smart dinosaurs and lots
>> of stupid dinosaurs.  Those smart dinosaurs came along early, but in the
>> survival wars, please note, the stupid dinosaurs won.

2)  I'm sorry, I hadn't noticed that any dinosaurs really "won". Abusing
one of my favorite quotes seems appropriate: "Using TSO is like kicking
a dead dinosaur up the beach". And, speaking of saurians and TSO, have
you taken careful notice of IBM's announcement about AIX (UNIX)?  It
looks like after many years, IBM may finally be offering their customers
some software that is as classy as their hardware.  We are flying out a
team next week to investigate.  (Proving that I too can ramble).