Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site calgary.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!alberta!calgary!radford
From: radford@calgary.UUCP (Radford Neal)
Newsgroups: net.lang.c,net.arch
Subject: Time penalty for non-alignment on VAX/780
Message-ID: <330@calgary.UUCP>
Date: Wed, 20-Mar-85 16:40:16 EST
Article-I.D.: calgary.330
Posted: Wed Mar 20 16:40:16 1985
Date-Received: Sun, 24-Mar-85 05:16:58 EST
References: <9251@brl-tgr.ARPA> <317@cmu-cs-k.ARPA> <9277@brl-tgr.ARPA>
Organization: University of Calgary, Calgary, Alberta
Lines: 66
Xref: watmath net.lang.c:4850 net.arch:1022

RE: The discussion of whether C ought to pad structures to align data,
    and the subsequent discussion of how much this gets you on a VAX.

There's nothing like actual data on a question like this. I ran the
following quick test program on a VAX 11/780 (Berkeley 4.2 C compiler):

    #include <stdio.h>    /* header name lost in transit; <stdio.h> assumed */

    main(argc,argv)
      int argc;
      char **argv;
    { int a[2];
      register int *p;
      int n;              /* number of loop passes */
      int o;              /* byte offset added to &a[0] */
      int rw;             /* first character of the mode argument */
      register int x;

      n = atoi(*++argv);
      o = atoi(*++argv);
      rw = **++argv;

      p = (int*)((int)&a[0]+o);   /* longword pointer, misaligned unless o%4 == 0 */

      if (rw=='r')
      { /* as written, 'r' selects the store loop: ten stores through p per pass */
        while (n>0)
        { *p = 0; *p = 0; *p = 0; *p = 0; *p = 0;
          *p = 0; *p = 0; *p = 0; *p = 0; *p = 0;
          n -= 1;
        }
      }
      else
      { /* any other mode: ten loads through p per pass */
        while (n>0)
        { x = *p; x = *p; x = *p; x = *p; x = *p;
          x = *p; x = *p; x = *p; x = *p; x = *p;
          n -= 1;
        }
      }
    }

The results are as follows:

    % time aligntime 100000 0 r
            2.1 real         1.8 user         0.0 sys
    % time aligntime 100000 1 r
            5.3 real         3.6 user         0.0 sys
    % time aligntime 100000 2 r
            5.1 real         3.6 user         0.0 sys
    % time aligntime 100000 3 r
            5.7 real         3.7 user         0.1 sys
    % time aligntime 100000 4 r
            1.9 real         1.5 user         0.0 sys
    % time aligntime 100000 0 w
            1.3 real         1.1 user         0.0 sys
    % time aligntime 100000 1 w
            3.2 real         2.7 user         0.0 sys
    % time aligntime 100000 2 w
            5.5 real         2.9 user         0.0 sys
    % time aligntime 100000 3 w
            3.0 real         2.6 user         0.0 sys
    % time aligntime 100000 4 w
            1.5 real         1.2 user         0.0 sys

Conclusion: Alignment of longwords on a mod 4 boundary gets you better
than a factor of two speed-up in both the r and w runs. Alignment on a
mod 8 boundary makes no significant further difference.

This penalty is a bit worse than I would expect. Does the 780's
microcode fetch non-aligned longwords a byte at a time from cache? Note
that the 64-bit data path to main memory is not relevant for this test,
only cache accesses.

     Radford Neal
     The University of Calgary