Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site lanl.ARPA
Path: utzoo!watmath!clyde!bonnie!akgua!gatech!seismo!cmcl2!lanl!jlg
From: jlg@lanl.ARPA
Newsgroups: net.arch
Subject: Re: 128Mb - I give up!
Message-ID: <34416@lanl.ARPA>
Date: Fri, 6-Dec-85 20:57:40 EST
Article-I.D.: lanl.34416
Posted: Fri Dec 6 20:57:40 1985
Date-Received: Sun, 8-Dec-85 03:26:31 EST
References: <285@frog.UUCP> <34249@lanl.ARPA> <696@unc.unc.UUCP>
Reply-To: jlg@a.UUCP (Jim Giles)
Organization: Los Alamos National Laboratory
Lines: 49

> > Actually, I can't remember a time when the fastest machines on the
> > market had virtual memory. Page swapping can, at best, improve
> > throughput (usually not). Page swapping is almost guaranteed to degrade
> > turn-around of individual tasks.
>
> The Cyber 203 & 205 which can outperform the Crays on most good
> days do have virtual memory.

I guess by 'good days' you mean those days when the only code you run
has very long vectors in highly vectorized code, or has been VERY
carefully optimized for Cybers. I've seen a lot of benchmarks of both
machines (I work with several different vintages of Crays on a daily
basis - and most of the people I work with are interested in only one
thing - SPEED). The Cyber does very well on specific kinds of problems
involving long vectors. It also does reasonably well on codes that have
been carefully tailored for Cyber machines (e.g. standard benchmark sets
like the 'Livermore Loops'). The Cyber does consistently worse than
Crays on short vectors, scalar code, and code that hasn't been recoded
for the specific machine - this includes most production codes at most
of the major labs.

The problem is that vector setup time on Cybers is enormous. You are
right that the asymptotic speed of Cybers is faster than that of the
older Crays, but that is only for brief spurts of pure vector code.
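
To make the setup-time argument concrete, here is a toy timing model, sketched under loud assumptions: one vector operation of length n costs a fixed setup plus a per-element pipeline cost. The function names and every cycle count below are hypothetical, picked only to show the shape of the tradeoff. They are not measurements of any Cyber or Cray.

```python
# Toy model: time(n) = setup + n * per_element.
# All numbers are made up for illustration; they are NOT real machine figures.

def vector_time(n, setup, per_element):
    """Cycles for one vector operation on n elements."""
    return setup + n * per_element

def rate(n, setup, per_element):
    """Results per cycle; approaches 1 / per_element as n grows."""
    return n / vector_time(n, setup, per_element)

def break_even_length(setup_a, per_a, setup_b, per_b):
    """Vector length at which machine A (bigger setup, faster pipeline)
    takes the same time as machine B (smaller setup, slower pipeline)."""
    return (setup_a - setup_b) / (per_b - per_a)

# Hypothetical machines: A has the faster pipeline but a large setup cost.
BIG_SETUP = {"setup": 100, "per_element": 0.5}
SMALL_SETUP = {"setup": 10, "per_element": 1.0}

if __name__ == "__main__":
    for n in (9, 64, 1000, 100000):
        print(n, rate(n, **BIG_SETUP), rate(n, **SMALL_SETUP))
    print("break-even length:", break_even_length(100, 0.5, 10, 1.0))
```

With these made-up numbers the big-setup machine only pulls ahead for vectors longer than 180 elements. A 9-element operation (the size of a 3x3 matrix) is dominated almost entirely by setup.
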
This extreme vector setup time means that short vectors don't run very
fast at all (e.g. multiplying two 3x3 matrices is not very efficient on
Cybers). Long vectors, where the pipeline time dominates, run very fast
indeed. Real production codes have a heterogeneous mix of vector
lengths, as well as a lot of inherently scalar code for which the Cyber
doesn't compete well at all.

Meanwhile, vector setup time on the Cray is always short and
predictable, even for data that is not contiguous and (with the X/MP)
even for gather/scatter operations. This means that short vectors (which
constitute a large proportion of many codes) run nearly as efficiently
as long vectors. Generally, for most codes with heterogeneous mixes of
vector lengths, older Crays run slightly faster than Cybers - new Crays
(X/MPs, Cray II) run much faster.

The virtual memory is actually a large part of the speed degradation.
In order to run vectors efficiently, a vector must not span page
boundaries. This means that each new vector operation must have its
data moved around in memory so that page faults don't occur from the
vector unit. If Cybers had very large central memory instead of virtual
memory, they would almost certainly be faster machines (and would
therefore compete better than they have).

J. Giles
Los Alamos