Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!usc!wuarchive!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.unix.cray
Subject: Re: Summary for Protection in Cray
Message-ID:
Date: 11 Jan 91 00:17:50 GMT
References: <1991Jan10.230715.404@agate.berkeley.edu>
Sender: usenet@ee.udel.edu
Organization: College of Marine Studies, U. Del.
Lines: 60
Nntp-Posting-Host: perelandra.cms.udel.edu
In-reply-to: chiueh@sprite.Berkeley.EDU's message of 10 Jan 91 23:07:15 GMT

> On 10 Jan 91 23:07:15 GMT,chiueh@sprite.Berkeley.EDU (Tzi-cker Chiueh) said:

chiueh> So why does Cray get rid of virtual memory altogether ? Or
chiueh> does anybody know how much performance improvement can we gain
chiueh> from getting rid of VM

kent> The number of cycles needed to transfer the first word from
kent> memory to a register is one of the most critical timings in
kent> the supercomputer. Cray can do this in 17 cycles. An SX3
kent> requires 70 cycles. An ETA 10 needed hundreds of cycles.
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is/was a common misconception. The ETA-10 actually attained
vector startup times as short as about 23 cycles if the first pages
of all the operands were in real memory. This includes the time
required to get the first element from memory into the pipe, plus
3-4 more cycles to get the pipe filled. Results were then available
from the first operation on about the 24th cycle (if the result was
to be immediately reused) or about the 30th cycle if the result had
to go all the way back to memory.

The startup time varied between about 16 and 32 cycles depending on
whether the memory banks were aligned and whether or not operations
were being chained (in which case there were two pipes to fill, not
just one).

On a number of test loops, the ETA-10 was significantly *faster* on
short vector operations than the 8.5ns Cray X/MP.
This did not typically mean that short-vector *application codes* ran
faster on the ETA-10, though.... :-(

kent> Adding demand paging will significantly lengthen this cycle
kent> time. If you can add demand paging without adding cycles to
kent> this memory fetch time, then I am sure Cray will make you a
kent> rich person.

ETA/CDC did it, and it certainly did not make them rich!

I believe that the two Cray companies simply decided that the benefits
of VM were not worth the hassle. So far the market has proven them
right.

kent> Supercomputers with virtual memories have been tried. The CDC
kent> 205 and the ETA10 are examples. When these machines ran codes
kent> where the problem size exceed the RAM size (paging), they ran
kent> 10 time slower than when paging did not occur.

This is hardly surprising. Anyone with any experience at all realizes
that VM is to be used to make a small class of jobs much easier to
code by letting the hardware handle the large address space -- *not*
to just run larger-than-real-memory jobs.

It should be noted that it is possible to write jobs that are larger
than real memory but which do not slow down significantly in a VM
system. One application was a straightforward LU-decomposition of a
2000x2000 dense matrix. The matrix required 4 Million words of virtual
space, but only about 2 Million words of real memory were available to
the user on the machine. By using a block-mode algorithm and the best
ETA UNIX swapping code, our CDC applications specialist was able to
get nearly full performance on this problem. The advantage relative
to the Cray was that on the ETA it could be done in standard Fortran,
while the Cray would have required explicit I/O.
-- 
John D. McCalpin                        mccalpin@perelandra.cms.udel.edu
Assistant Professor                     mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.      J.MCCALPIN/OMNET
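
P.S. For anyone unfamiliar with the "block-mode" idea mentioned for
the LU example, here is a rough sketch of a right-looking blocked LU.
It is NOT the code our applications specialist wrote (that was
Fortran, and I am not reproducing it here); it is only an illustration
in Python, it assumes no pivoting is needed (the test matrix is made
diagonally dominant for that reason), and the names lu_blocked,
lu_unblocked, make_matrix, reconstruct and the block size bs are all
made up for this note. The point is that each sweep over the trailing
matrix applies bs columns' worth of updates at once, so the
out-of-memory part of the matrix is swept roughly bs times less often
than in the column-at-a-time version.

```python
def make_matrix(n):
    """Hypothetical test matrix: diagonally dominant, so no pivoting is needed."""
    a = [[1.0 / (1 + abs(i - j)) for j in range(n)] for i in range(n)]
    for i in range(n):
        a[i][i] += n
    return a


def lu_unblocked(a):
    """In-place LU, no pivoting. L (unit diagonal) below, U on and above."""
    n = len(a)
    for k in range(n):
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a


def lu_blocked(a, bs):
    """In-place right-looking blocked LU, no pivoting, block size bs."""
    n = len(a)
    for k0 in range(0, n, bs):
        k1 = min(k0 + bs, n)
        # 1. Factor the panel (columns k0..k1-1, rows k0..n-1).
        for k in range(k0, k1):
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # 2. Forward-substitute to get the U block row (columns k1..n-1).
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # 3. Update the trailing matrix once per block: A22 -= L21 * U12.
        for i in range(k1, n):
            for k in range(k0, k1):
                lik = a[i][k]
                for j in range(k1, n):
                    a[i][j] -= lik * a[k][j]
    return a


def reconstruct(lu):
    """Multiply the packed L and U factors back together (for checking)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out
```

To try it, factor copies of the same matrix both ways, e.g.
lu_unblocked([r[:] for r in a]) and lu_blocked([r[:] for r in a], 3),
and compare the packed factors, or check that reconstruct() gives back
the original matrix to within rounding. A real out-of-core version
would choose bs so that the panel plus one block row fits comfortably
in real memory; what value works is machine-dependent and not
something this sketch tries to answer.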