Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!samsung!munnari.oz.au!brolga!bunyip.cc.uq.oz.au!marlin.jcu.edu.au!csrdh
From: csrdh@marlin.jcu.edu.au (Rowan Hughes)
Newsgroups: comp.sys.hp
Subject: Re: A benchmark
Message-ID: <1991May10.002450.19814@marlin.jcu.edu.au>
Date: 10 May 91 00:24:50 GMT
References: <1991May3.023705.5616@marlin.jcu.edu.au> <5570629@hpfcdc.HP.COM>
Organization: James Cook University
Lines: 32

In <5570629@hpfcdc.HP.COM> chuckc@hpfcdc.HP.COM (Chuck Cairns) writes:

....
>Just wondered what your memory access pattern is on this particular
>benchmark. If you access memory address X then X+1 do you take a large hop
>in memory location? What I'm suspecting is that your benchmark may be

No, the program is essentially vectorized, with unit stride in most
cases. The data would have streamed through the CPU exactly linearly.
Hence it's a good measure of streaming performance; see John McCalpin's
stuff in comp.benchmarks. Data reuse in the cache would only have
existed for the scalar parts (< 15%). The vector size (for "A benchmark")
is <= 90 (double prec). I think that's why the Cray time was a little
slow: its N1/2 was the cause.

>Since I don't know your particular benchmark these may not be applicable
>ideas ... Is it possible to run the benchmark with "rows and columns"
>exchanged ? I'd also be keen on knowing what happens when the 730 runs
>totally from it's 256Kbyte data cache i.e. ... smaller arrays.

HP software engineers (Sydney HP) ran and looked at this program. They
said the row/col major ordering wasn't making any difference, so I'll
have to take their word for it. I'll publish the times for the IBM540
in about 4 weeks. I doubt a larger cache would help, certainly not for
the vector parts anyway.

NB. MOM is NOT a benchmark, it's a real program >:)

Cheers
-- 
Rowan Hughes                                James Cook University
Marine Modelling Unit                       Townsville, Australia.
Dept. Civil and Systems Engineering         csrdh@marlin.jcu.edu.au
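
To put a rough shape on the N1/2 remark in the post, here is a minimal sketch of Hockney's vector-performance model, in which a vector loop of length n reaches the fraction n / (n + n_half) of the asymptotic rate. The function name `vector_rate` and the n_half values are hypothetical placeholders chosen only to show the trend; they are not measurements from the Cray, the 730, or any other machine named in the thread.

```python
def vector_rate(n, r_inf, n_half):
    """Hockney's model: achieved rate for a vector loop of length n.

    r_inf  -- asymptotic rate for very long vectors (any units)
    n_half -- vector length at which half of r_inf is reached
    """
    return r_inf * n / (n + n_half)


if __name__ == "__main__":
    n = 90  # vector length mentioned in the post (<= 90, double precision)
    # Hypothetical n_half values, for illustration only.
    for n_half in (10, 50, 100):
        fraction = vector_rate(n, 1.0, n_half)
        print(f"n_half={n_half:4d}  fraction of peak at n={n}: {fraction:.2f}")
```

Under this model, a machine whose n_half is comparable to or larger than 90 spends much of each short loop on startup cost, which would be consistent with the post's guess about the Cray time.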