Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!julius.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!aglew
From: aglew@crhc.uiuc.edu (Andy Glew)
Newsgroups: comp.arch
Subject: Alignment on RS/6000
Message-ID:
Date: 20 Nov 90 05:29:26 GMT
Sender: news@ux1.cso.uiuc.edu (News)
Organization: Center for Reliable and High-Performance Computing
	University of Illinois at Urbana Champaign
Lines: 55

Newsgroups: info.rs6000
From: U644401@hnykun11.bitnet (Wilfred Janssen)
Subject: performance breakdown due to misalignment
Original-To: Multiple recipients of list POWER-L
Reply-To: POWER-L IBM RS/6000 POWER Family
Organization: University of Illinois at Urbana
Distribution: info
Date: Mon, 19 Nov 90 14:20:57 MET

We experienced a problem with our RS6000 model 320. Some FORTRAN
applications ran at 0.2 Mflop/s, instead of at the usual 7.5 Mflop/s.
We found that this was caused by a misalignment of double precision
arrays.

As an illustration we include a little test program, which calls the
routine DROT (in library BLAS). When the arrays X and Y start at
double word boundaries (I0 = 0), the program requires 7.5 sec. Shift
the start address by one byte (I0 = 1) and the routine requires
398 sec!

Wilfred Janssen
Paul Wormer

-----------------------------------------------------------------
C TEST TO ILLUSTRATE THE PENALTY OF MISALIGNMENT ON THE RS6000.
      PARAMETER (KILO=1024)
      CHARACTER*1 SPACE(0:16*KILO)
C------------------------------------------------------------------
C THE FOLLOWING STATEMENT IS THE CULPRIT, CHANGE I0 TO 0 AND THE
C PROGRAM RUNS 53 TIMES FASTER!
C------------------------------------------------------------------
      I0 = 1
      I1 = I0 + 8*KILO
      CALL EXEC(SPACE(I0), SPACE(I1), KILO)
      END
      SUBROUTINE EXEC(X, Y, N)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION X(N), Y(N)
      DO 10 I=1,N
      X(I) = 1.D0
      Y(I) = -1.D0
 10   CONTINUE
      C = SQRT(0.5D0)
      S = -C
      DO 20 I=1,N*20
      CALL DROT(N, X,1, Y,1, C,S)
 20   CONTINUE
      END
--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]
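
For comparison, here is a rough sketch of the same test with the work space declared as a DOUBLE PRECISION array rather than carved out of a CHARACTER array. This is not from the original post. It assumes, as is usual but not guaranteed by the language, that the compiler places DOUBLE PRECISION arrays on double word boundaries, which corresponds to the fast I0 = 0 case. The names ALIGND, WORK and ROT2 are made up for illustration; ROT2 is a plain-loop stand-in for BLAS DROT with unit strides, included only so the sketch builds without the BLAS library.

```fortran
C     SKETCH ONLY: WORK SPACE DECLARED AS DOUBLE PRECISION SO THAT THE
C     COMPILER (BY ASSUMPTION) ALIGNS X AND Y ON DOUBLE WORD BOUNDARIES.
      PROGRAM ALIGND
      PARAMETER (KILO=1024)
      DOUBLE PRECISION WORK(2*KILO)
C     X IS THE FIRST KILO ELEMENTS, Y STARTS 8*KILO BYTES LATER.
      CALL EXEC(WORK(1), WORK(KILO+1), KILO)
      PRINT *, WORK(1), WORK(KILO+1)
      END

      SUBROUTINE EXEC(X, Y, N)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION X(N), Y(N)
      DO 10 I=1,N
         X(I) = 1.D0
         Y(I) = -1.D0
   10 CONTINUE
      C = SQRT(0.5D0)
      S = -C
      DO 20 I=1,N*20
         CALL ROT2(N, X, Y, C, S)
   20 CONTINUE
      END

C     ROT2 IS A HYPOTHETICAL STAND-IN FOR BLAS DROT (UNIT STRIDES):
C     IT APPLIES THE PLANE ROTATION (C,S) TO EACH PAIR X(I), Y(I).
      SUBROUTINE ROT2(N, X, Y, C, S)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION X(N), Y(N)
      DO 10 I=1,N
         T    = C*X(I) + S*Y(I)
         Y(I) = C*Y(I) - S*X(I)
         X(I) = T
   10 CONTINUE
      END
```

To compare against the library routine, replace the ROT2 call with CALL DROT(N, X,1, Y,1, C,S) and link BLAS. Timings will depend on the machine and compiler, so none are quoted here.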