Path: utzoo!mnetor!uunet!lll-winken!lll-crg.llnl.gov!brooks
From: brooks@lll-crg.llnl.gov (Eugene D. Brooks III)
Newsgroups: comp.arch
Subject: Re: Memory bank conflicts
Message-ID: <4773@lll-winken.llnl.gov>
Date: 11 Mar 88 18:15:16 GMT
References: <7690@pur-ee.UUCP> <3300021@uiucdcsm> <4712@lll-winken.llnl.gov> <12514@sgi.SGI.COM>
Sender: usenet@lll-winken.llnl.gov
Reply-To: brooks@lll-crg.llnl.gov.UUCP (Eugene D. Brooks III)
Organization: Lawrence Livermore National Laboratory
Lines: 23

In article <12514@sgi.SGI.COM> bron@olympus.SGI.COM (Bron C. Nelson) writes:
>The issue regarding memory bank conflicts usually has to do (no
>surprise) with array accesses.  I seem to recall that someone did a
>study of array indexing (either LLNL or Cray I believe) and concluded
>that for their test cases, about 60% of array accesses had a stride
>of 1 (i.e. the code stepped sequentially through the array in memory
>order), about 20% had stride 2, and about 20% "other".  WARNING: this
>is off the top of my head; probably mis-remembered (the stride 2

My best informant indicates 80% stride 1, 20% "other".  For 2-D arrays,
people actively pad to get stride 1 one way and an "odd" stride the
other, so array refs in both dimensions go at the full clip.  Of the 20%
"other", about 3% is estimated to be random gather.  There are heavily
used codes, however, which use random gather at the 25% level.  It's
just that this might be one code out of 10 or 20.

Random gather "never" runs at the full clip on any machine due to
"random conflicts", and very few machines handle random gather without
a "several clock penalty" per vector element.  Some machines, the names
left unmentioned for my own personal protection, are quite effectively
castrated by random gather in performance terms (even though the random
gather is supported in hardware).

The data mentioned above are "estimates" from knowledgeable sources;
such detailed statistics are very difficult to obtain.
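For anyone wondering why padding to an "odd" stride helps, here is a minimal sketch in Python. It rests on assumptions that are not from the post: a hypothetical machine with 64 interleaved banks, word address a living in bank a mod 64, and a column-major (Fortran-style) 2-D array whose element (i, j) sits at word offset i + j*lead_dim. The names banks_touched, bank_sequence and N_BANKS are made up for illustration. Under that model a strided stream visits n_banks / gcd(stride, n_banks) distinct banks, so any odd stride reaches every bank when the bank count is a power of two.

```python
from math import gcd

N_BANKS = 64  # hypothetical bank count (a power of two), not from the post


def banks_touched(stride, n_banks=N_BANKS):
    """Distinct banks visited by a long strided stream,
    assuming word address a maps to bank a % n_banks."""
    return n_banks // gcd(stride, n_banks)


def bank_sequence(start, stride, count, n_banks=N_BANKS):
    """Bank index of each of the first `count` references in the stream."""
    return [(start + i * stride) % n_banks for i in range(count)]


if __name__ == "__main__":
    # Column-major layout: stepping i is stride 1, stepping j is stride lead_dim.
    for lead_dim in (64, 65):
        print("leading dimension", lead_dim,
              "| stride-1 direction:", banks_touched(1), "banks",
              "| other direction:", banks_touched(lead_dim), "banks")
```

With the leading dimension left at 64, the second direction keeps hitting one bank; padded to 65, it spreads over all 64. This only counts banks and says nothing about bank busy time or the actual clock penalty, which depend on the machine. Random gather has no fixed stride, so this kind of padding does not help it.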