Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.1 6/24/83; site ecsvax.UUCP Path: utzoo!watmath!clyde!burl!ulysses!allegra!bellcore!decvax!mcnc!ecsvax!ctk From: ctk@ecsvax.UUCP (Tim Kelley) Newsgroups: net.sources Subject: 8087 linear algebra routines (to be called from C) Message-ID: <684@ecsvax.UUCP> Date: Sat, 9-Feb-85 12:12:09 EST Article-I.D.: ecsvax.684 Posted: Sat Feb 9 12:12:09 1985 Date-Received: Tue, 12-Feb-85 05:40:20 EST Organization: NCSU Dept. of mathematics Lines: 574 This shell script contains 8087 assmebler language functions for some linear algebra applications. The files in here are 1. 8087.doc -- a small explanation. 2. test.c -- a C program to test for speed etc. 3. qsol.c -- a C function to solve linear systems by QR decomposition includes the decomposition routines. Illustrates the use of the assmebler functions. 4. ssdaxpy.asm, ssxdot.asm, dnrm2.asm -- the assembler functions. ------------ cut here --------- : Run this shell script with "sh" not "csh" PATH=:/bin:/usr/bin:/usr/ucb export PATH all=FALSE if [ $1x = -ax ]; then all=TRUE fi /bin/echo 'Extracting 8087.doc' sed 's/^X//' <<'//go.sysin dd *' >8087.doc These routines are assembler versions of some of the basic linear algebra subroutines (BLAS) that are used in packages like LINPACK [1]. The routines are designed to be called by the CI-C86 C compiler and are roughly twice as fast as routines coded in C. The routines and what they do are summarized in the comments at the beginning of the each source file. As an example of their use I'm including a QR factorization routine (written in C) and a function that calls it to solve a system of linear equations. The QR function and the supporting routines are taken from the algorithms in [2] which are based on [3]. To use these rou- tines in nonroutine ways you should look at the references. Ref. [1] is particularly useful for those who want to roll their own linear algebra routines. References 1. LINPACK Users Guide, Dongarra, Bunch, Moler, and Stewart, SIAM, 1979. 2. Numerical Methods for Unconstrained Optimization and Non- linear Equations, Dennis and Schnabel, Prentice Hall, 1983. 3. Introduction to Matrix Computation, Stewart, Academic Press, 1973. Another good general reference on these topics is 4. Matrix Computations, Golub and Van Loan, Johns Hopkins, 1984 The 8087 routines should be assembled using the IBM assembler version 2.0 or later and the .h files from the CI compiler disks should be on the same disk. I would appreci- ate hearing about any bugs that you find in these routines. The CI documentation (for v. 2.1) says that the pro- cedure for returning doubles and floats may change. Func- tions that need to return doubles do this through pointers to avoid this. Currently doubles are returned in the ax,bx,cx,and dx registers. Other compilers return doubles on the top of the 8087 stack. I play it safe. These functions work in the **small model** only. Conversion to the large model should be straightforward but very boring. If anyone does this please send me the code before I have to do it myself (in about 6 mos.). C.T. Kelley Dept. of Math. N.C. State U. Box 8205 Raleigh, N.C. 27695-8205 919-737-3300 decvax!mcnc!ecsvax!ctk //go.sysin dd * made=TRUE if [ $made = TRUE ]; then /bin/chmod 644 8087.doc /bin/echo -n ' '; /bin/ls -ld 8087.doc fi /bin/echo 'Extracting test.c' sed 's/^X//' <<'//go.sysin dd *' >test.c #include "stdio.h" #define fabs(A) (((A)>0.0) ? (A) : -(A)) #define sign(x) ((x == 0.0) ? 1.0 : x/fabs(x)) #define MAXDIM 81 X/* MAXDIM = 81 is a little dangerous. If you are storing much else use 41.*/ extern double cos(); X/* This program is a test for the program qsol. The program creates an orthogonal matrix, a, and a right hand side b. The solution to Ax=b is known exactly and subtracted from the x given by qsol. The function fabs() is given by a macro to speed things up. */ main() { double a[MAXDIM][MAXDIM],b[MAXDIM],x[MAXDIM],z,y,y1; double u[MAXDIM],norm; double *g[MAXDIM],b1[MAXDIM]; int i,j,n1,j1,n; /* n = dimension of problem <= MAXDIM. */ n=50; /* With n=50 this code runs in 11.20 sec on an AT. */ n1=n-1; norm=0.0; /* Create the matrix and the right hand side. */ for(i=0;iqsol.c #include "stdio.h" #define fabs(A) (((A)>0.0) ? (A) : -(A)) #define sign(x) ((x == 0.0) ? 1.0 : x/fabs(x)) #define MAXDIM 81 X/* MAXDIM = 81 is a little dangerous. If you are storing much else use 41.*/ X/* qsol.c -- Solve linear systems by QR factorization Functions in this file: qsol() qrdecm() qrsolv() rsolv() m is an array of pointers to rows of matrix, b is right hand side, x is solution, n is dimension of the problem. Indices vary from 0 to n-1, calling program must adjust to this. The algorithm is the one given in G.W. Stewart's book, Introduction to Matrix Computations. The code is based on the algorithms given in Dennis and Schnabel's book, Numerical Methods for Unconstrained Optimization and Nonlinear Equations. If you are unfamiliar with these ideas you should consult one or both of these books. */ qsol(m,b1,x,n) double *m[],b1[],x[]; int n; X/* This is the driver function for the QR decomposition routines and the solvers. In using these you should be sure to pass an array of pointers to the rows of the matrix rather than the matrix itself. That is the setup should include something like for(i=0;i=0;i--) { ssxdot(m[i]+i+1,b+i+1,n-i-1,&sum,8,8); b[i]-=sum; b[i]/=m2[i]; } } //go.sysin dd * made=TRUE if [ $made = TRUE ]; then /bin/chmod 644 qsol.c /bin/echo -n ' '; /bin/ls -ld qsol.c fi /bin/echo 'Extracting ssxdot.asm' sed 's/^X//' <<'//go.sysin dd *' >ssxdot.asm ; ; See the comment on header files in ssdaxpy.asm. ; include model.h INCLUDE PROLOGUE.H public ssxdot ; 8087 dot product ; ssxdot () ; X.8087 ssxdot proc near push bp mov bp,sp ; Current syntax ssxdot(&f,&g,n,&h,stride1,stride2) -- > dot product stored in h ; &f in di, &g in bx , n in cx. This function stores the dot product rather ; than returning it because different C compilers return doubles in different ; ways. ; ;This is equivalent to the following C function. ; ; ssxdot(v1,v2,n,dptr,stride1,stride2) ; double *v1,*v2,*dptr; ; int n,stride1,stride2; ; { ; int i; ; *dptr=0.0; ; for(i=0;issdaxpy.asm ; ; The header files model.h, prologue.h, epilogue.h are specific to the ; Computer Innovations CI-C86 compiler. Your compiler is probably different. ; include model.h INCLUDE PROLOGUE.H public ssdaxpy ; 8087 main loop for lsol ; ssdaxpy () ; X.8087 ssdaxpy proc near push bp mov bp,sp ; Current syntax ssdaxpy(x,y,n,a,stride1,stride2) y=y+ax ; ; This routine is equivalent to the following C funciton. ; ; ssdaxpy(x,y,n,a,stride1,stride2) ; double *x,*y,a; ; int n,stride1,stride2; ; { ; int i; ; for(i=0;idnrm2.asm ; ; See the comments on header files in ssdaxpy.asm. ; include model.h INCLUDE PROLOGUE.H public dnrm2 ; 8087 dot product ; dnrm2 () ; X.8087 X.286C dnrm2 proc near push bp mov bp,sp ; Current syntax dnrm2(&x,n,&nrm,stride) -- > l2 norm of x stored in h ; &x in s , n in cx , &nrm in di ; ; This is equivalent to the following C function. ; dnrm2(xptr,n,normptr,stride) ; double *xptr,*normptr; ; int n,stride; ; { ; int i; ; *xptr=0.0; ; for(i=0;i