Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!asuvax!ncar!csn!magnus.ircc.ohio-state.edu!zaphod.mps.ohio-state.edu!caen!ox.com!emv
From: schreiber@schreiber.asd.sgi.com (Olivier Schreiber)
Newsgroups: comp.archives
Subject: [sgi] Optimized blas1,2,3 available anonymous ftp
Keywords: blas lapack
Message-ID: <1991Mar6.013951.15828@ox.com>
Date: 6 Mar 91 01:39:51 GMT
References: <1991Mar1.192353.17038@odin.corp.sgi.com>
Sender: emv@ox.com (Edward Vielmetti)
Reply-To: schreiber@schreiber.asd.sgi.com (Olivier Schreiber)
Followup-To: comp.sys.sgi
Organization: Silicon Graphics, Inc. Mountain View, CA
Lines: 90
Approved: emv@ox.com (Edward Vielmetti)
X-Original-Newsgroups: comp.sys.sgi

Archive-name: math/linear-algebra/blas/1991-03-01
Archive-directory: sgi.com:/pub/lib/libblas/ [192.48.153.1]
Original-posting-by: schreiber@schreiber.asd.sgi.com (Olivier Schreiber)
Original-subject: Optimized blas1,2,3 available anonymous ftp
Reposted-by: emv@ox.com (Edward Vielmetti)

% ftp sgi.com
or
% ftp 192.48.153.1
Connected to 192.48.153.1.
220 SGI.COM FTP server (version 5.60 IRIX 02/25/91 16:25) ready.
Name (192.48.153.1:guest): anonymous
331 Guest login ok, type your name as password.
Password:
230 Guest login ok, access restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd /pub/lib/libblas
ftp> pwd
257 "/pub/lib/libblas" is current directory.
ftp> ls
200 PORT command successful.
150 Opening ASCII mode data connection for '/bin/ls'.
total 1595
-rw-r--r--  1 ftp  guest    2735 Mar  1 11:04 README
-rw-r--r--  1 ftp  guest  813228 Feb 26 10:20 libblas.a.Z
226 Transfer complete.

BLAS   : "Basic Linear Algebra Subprograms" is the library used as a
         toolkit for the LAPACK project.

LAPACK : "Linear Algebra Package" is a project originated by Jack
         Dongarra from Oak Ridge National Lab. This project, supported
         by the National Science Foundation (NSF), will put together a
         new set of linear algebra functions intended to supplant both
         the LINPACK and EISPACK packages.
To achieve maximum efficiency across all types of hardware, the LAPACK
routines are based on matrix-matrix BLAS 3 routines (e.g. DGEMM). This
approach performs much better than anything based on vector-vector
BLAS 1 routines (e.g. DAXPY), or even matrix-vector BLAS 2 routines
(e.g. DGEMV).

Release time for LAPACK: April 1991

------------------------------------------------------------------------------
WARNING --- This current version is an "alpha-version"!

- The real (prefix S) and double precision (prefix D) BLAS2 and BLAS3
  routines have been hand optimized and parallelized (in Fortran).
- The only complex routines hand optimized/parallelized are CGEMM and
  ZGEMM (in Fortran).
- The BLAS1 routines are not parallelized; the most important are
  hand-coded in assembly language.
- Although these routines have been intensively tested, it is possible
  that a few bugs are left.
------------------------------------------------------------------------------
To load on your machine:

    uncompress libblas.a
------------------------------------------------------------------------------
Example of performance:

    dgemm double precision > 60 Mflops on a 4D/380
------------------------------------------------------------------------------
Known problem:

Performance may be reduced when arrays are perfectly aligned to
cache-size boundaries. This may happen when "leading" dimensions are
powers of two. For example, it is better to declare a matrix
(1025,1024) rather than (1024,1024).
------------------------------------------------------------------------------
Send comments/complaints/bug reports to:

    Jean-Pierre Panziera
    Silicon Graphics
    fax   : (415)962-9601
    E-Mail: jpp@corp.sgi.com

--
Olivier Schreiber  Technical Marketing  schreiber@sgi.com
(415)335 7353  MS/7L580
Silicon Graphics Inc., 2011 North Shoreline Blvd.
Mountain View, Ca 94039-7311
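
The "known problem" noted in the README can be illustrated with a rough
model. The sketch below is not taken from the library. It assumes a
hypothetical direct-mapped cache (64 KB, 16-byte lines), which is not the
documented geometry of the 4D/380, and the helper names distinct_sets and
padded_leading_dim are invented for illustration. It counts how many cache
sets are touched when walking along one row of a column-major array, where
consecutive elements are one leading dimension apart in memory.

```python
# Hypothetical cache geometry, chosen only for illustration; these are NOT
# the documented parameters of the 4D/380 or any specific SGI machine.
CACHE_BYTES = 64 * 1024   # assumed direct-mapped cache size
LINE_BYTES = 16           # assumed cache line size
ELEM_BYTES = 8            # one double precision element


def distinct_sets(leading_dim, ncols=1024):
    """Count the cache sets touched when walking along one row of a
    column-major array declared with the given leading dimension."""
    num_sets = CACHE_BYTES // LINE_BYTES
    stride = leading_dim * ELEM_BYTES  # bytes between a(i,j) and a(i,j+1)
    return len({(j * stride // LINE_BYTES) % num_sets for j in range(ncols)})


def padded_leading_dim(n):
    """Return n + 1 when n is a power of two, otherwise n unchanged."""
    is_power_of_two = n > 0 and (n & (n - 1)) == 0
    return n + 1 if is_power_of_two else n


if __name__ == "__main__":
    # Compare the power-of-two declaration with the padded one.
    for ld in (1024, padded_leading_dim(1024)):
        print(ld, distinct_sets(ld))
```

Under these assumed parameters, a leading dimension of 1024 makes every
row access land in a handful of cache sets, so lines keep evicting each
other, while 1025 spreads the same accesses over many more sets. Real
caches differ in size and associativity, so the exact counts will not carry
over to a given machine.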