Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!hc!lanl!beta!jlg
From: jlg@beta.lanl.gov (Jim Giles)
Newsgroups: comp.arch
Subject: Re: Cray & Amdahl (Really: VM on vector processors) (Was: ...)
Message-ID: <20839@beta.lanl.gov>
Date: 22 Jul 88 20:28:07 GMT
References: <4232@cbmvax.UUCP> <76700035@p.cs.uiuc.edu> <12174@ames.arc.nasa.gov>
Organization: Los Alamos National Laboratory
Lines: 34

In article <12174@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
> The real debate, in my opinion, is not between virtual and non-virtual
> (virtual won a long time ago, in my opinion- Cray is an anachronism
> in this respect) [...]

In general, this is true.  Most machines and applications are better
off with VM.  But I think Cray did the right thing for his market.
Most buyers of supercomputers have long since figured out how to do
memory management in software.  And, since these users know the data
usage patterns of their programs, their software VM is MUCH more
efficient than anything existing hardware can supply.  It's no
coincidence that many Cyber 205 users still run with VM turned off -
their codes run faster that way!

The problem is that hardware VM isn't flexible enough to deal with a
large variety of data usage patterns.  As a result, most VM machines
just do some variant of demand paging.  This is exactly the WRONG data
usage model for most large-scale scientific codes.  Providing more
sophisticated VM mechanisms would be more expensive and wouldn't really
help unless the user code is able to give 'hints' about the data usage
patterns.  But, if the user is required to give 'hints' in order to get
efficiency, he might as well do VM in software as he's always done
(figuring out what you need next is the hard part - actually reading it
in is easy).
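
To make the software approach concrete, here is a minimal sketch of a
double-buffered sweep over an out-of-core data file: the program asks
for the next block before it starts computing on the current one.  It
is an illustration under assumptions, not anyone's production code.  A
Python worker thread stands in for asynchronous I/O, and the names
read_block, process, sweep, demo and BLOCK_BYTES are all hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_BYTES = 4096  # hypothetical block size, chosen only for the demo


def read_block(path, index, block_bytes=BLOCK_BYTES):
    """Read block number `index`; returns b"" once past end of file."""
    with open(path, "rb") as f:
        f.seek(index * block_bytes)
        return f.read(block_bytes)


def process(block):
    """Stand-in for the numerical kernel: here it just sums the bytes."""
    return sum(block)


def sweep(path, block_bytes=BLOCK_BYTES):
    """Process the file in order while the next block loads in the background."""
    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(read_block, path, 0, block_bytes)
        index = 0
        while True:
            block = pending.result()  # blocks only if the read is unfinished
            if not block:
                break
            index += 1
            # The program knows which block it wants next, so it asks now.
            pending = pool.submit(read_block, path, index, block_bytes)
            total += process(block)   # compute overlaps the read in flight
    return total


def demo():
    """Write a small scratch file, sweep it, and return (result, expected)."""
    data = bytes(range(256)) * 100
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        path = f.name
    try:
        return sweep(path), sum(data)
    finally:
        os.remove(path)
```

The only "hard part" in this sketch is the line that picks the next
block index.  Here the access pattern is a plain sequential sweep, so
it is trivial; in a real code that choice is where the user's knowledge
of the data usage pattern goes.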

Unless the hardware VM mechanism can look ahead far enough to avoid
page faults entirely (several hundred thousand instructions with the
current difference in disk and memory speed), it will never beat the
clever use of asynchronous I/O that more sophisticated users have been
doing for years.  Of course, if the hardware can somehow divine the
data usage patterns of the code automatically (a channeller perhaps? :-),
then it could maybe even beat the user's software VM.
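
The look-ahead figure is just the disk latency expressed in
instructions.  A rough sketch of the arithmetic follows; the 20 ms
access time and 10 MIPS rate are illustrative assumptions, not
measurements of any particular machine, and the function name is
hypothetical.

```python
def lookahead_instructions(disk_latency_ms, mips):
    """Instructions executed while one disk access is outstanding.

    disk_latency_ms: assumed disk access time in milliseconds.
    mips: assumed instruction rate in millions of instructions per second.
    """
    instructions_per_ms = mips * 1000
    return disk_latency_ms * instructions_per_ms


# Illustrative figures only: a 20 ms access on a 10 MIPS processor.
needed = lookahead_instructions(20, 10)
```

With those assumed figures the prefetch has to be issued about 200,000
instructions before the data is touched; a faster processor or a slower
disk pushes the number up proportionally.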

J. Giles
Los Alamos