Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!ucbvax!WATSON.IBM.COM!jbs From: jbs@WATSON.IBM.COM Newsgroups: comp.arch Subject: massive linpack Message-ID: <9106080254.AA12624@ucbvax.Berkeley.EDU> Date: 8 Jun 91 02:57:58 GMT Sender: daemon@ucbvax.BERKELEY.EDU Lines: 83 Richard Lethin asked: >At what size N for massive linpack does the accumulated numerical fuzz >swamp the precision available in a double precision number? I answered: > Very large. Typical relative error (in the vector norm sense) >in the solution is n*c*e, where n is matrix size, c is condition number >and e is machine epsilon (2**-52 for IEEE double). Therefore if you >want a solution accurate to 1 part in 1000 (10 bits) and your matrix >has condition number less than 1000 (10 bits) your n can be 2**30 or >10**9. This would require 10**18 storage and 10**27 ops to solve so >is well beyond current machines. Note once you have any bits in your >answer you can obtain more by iterative improvement an order N*2 pro- >cess. David Chase objects: I don't have the time to prove that you're wrong here, but I think that you are either wrong, or else being very misleading. In general, why should we believe you? What do you usually do? Goldberg and Hough are experts, as are the people that I took numerical analysis from too many years ago. Are you an expert? (I only know enough to know when I should be suspicious, but I'm suspicious.) First, you haven't said anything about worst case, which must be worried about. In addition, you haven't discussed a typical condition number. Second, I know that the NA people I once worked with were interested in using exact arithmetic for error analysis. (There's a paper by Boehm, O'Donnell, and Riggle in the 1986 Lisp & FP conference -- when I say "exact", I mean an answer guaranteed to be within epsilon of the true answer). This means that double precision isn't good enough for very accurate error analysis, though I don't know how they defined "very accurate". I suspect that the worst case bounds on error estimation were not satisfactory. Third, "LINPACK" is not a numerical analysis algorithm. There is a wide variety of algorithms in LINPACK, with varying stability and cost. If you don't specify the algorithm, how are we to know what you are talking about? Is it GE, GEPP, GEFP, QR, SVD, or Cholesky? Band matrix, or not? The worst case on GEPP (which is what I think you meant) is still pretty bad, though I've also heard (from an expert) that it is almost never seen. If (as I think you are saying) GEPP is so good, then why do we bother with more stable algorithms like QR and SVD? To clarify I was referring to the numerical properties of Gaussian elimination with partial pivoting (GEPP) when solving a large dense system of linear equations. This is what the TPP (towards peak performance) linpack benchmark does (on a system of size 1000). I presume the massive linpack benchmark is the same thing with larger matrices. Someone queried whether the quoted results were for problems which required pivoting which I agree is a good question. Anyone desiring further authority for my statements about GEPP should be able to find it in any good mathematics library. One reference is the book "Numerical Methods and Software" (David Kahaner, Cleve Moler and Stephen Nash, Prentice Hall, 1989). The accuracy of GEPP is discussed in chapter 3 (esp. p.66-p.67). It is well known there are certain pathological matrices for which GEPP does not perform well numerically. These matrices have the property that in the process of computing the factorization certain intermediate terms are formed which are much larger than the elements in the original matrix. These terms spoil the error analysis. If this does not happen then you are certain to be ok. This problem is almost never seen in practice. There- fore my statement about typical behavior. If you are using an appropriate numerical method the condition number generally will be determined by the stabilty of the physical system you are modeling and not by the number of points in your discret- ization. If your problem uses real inexact data the condition number best not be too large or your answer is going to be garbage regardless of how accurately you compute it. QR and SVD are used for solving least squares problems. Least squares problems can be formulated as linear equations (the normal equa- tions) but doing so unnecessarily increases the condition number. This is not a fault with GEPP but with the formulation. (This is discussed on p202-203 of the above referenced book.) As to credibility: I prefer to judge participants in the news- groups by the quality of their posts rather than by their "credentials" (or lack thereof). However for those readers who care about credentials I do have a few. I have a PhD in mathematics from MIT. Numerical anal- ysis is not my field of specialization but I have been exposed to it over the years, I have access to a mathematics library and I hopefully have enough mathematical sophistication to avoid gross misstatements. As to practical experience I have contributed to IBM's VS Fortran (ver- sion 2) intrinsic function library and to IBM's Engineering and Scienti- fic Subroutine Library (ESSL). James B. Shearer