Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!rice!sun-spots-request
From: djones@awesome.Berkeley.EDU
Newsgroups: comp.sys.sun
Subject: SPARC divide - really really slow!
Keywords: Miscellaneous
Message-ID: <4075@brazos.Rice.edu>
Date: 27 Dec 89 04:45:36 GMT
Sender: root@rice.edu
Organization: Sun-Spots
Lines: 45
Approved: Sun-Spots@rice.edu
X-Sun-Spots-Digest: Volume 9, Issue 1, message 5 of 6

I was faced with a program which ran as fast on a SUN 3/60 as it did on a
SUN 4/280, when there should be a factor of 2-3 difference if you believe
the MIPS rating. Profiling with "cc -pg" made it evident that the time is
going into integer division -- I gather SPARC has no divide instruction, so
every division becomes a call to a software routine. This is, of course,
part of the RISC strategy. I'm still just a bit surprised that SUN/SPARC
hasn't figured out a way to get integer divisions done a little faster on a
SUN 4/280 than on a SUN 3/60!

I was amused to see some of the "functions" that gprof found using up all
my CPU time. I gather the code checks to see if the numbers are
"not_really_big" or "not_too_big" to do the division (ahem) faster.

So are we stuck with this poor multiply/divide performance in SPARC, or is
this shortcoming being addressed? Heck, would it be faster to hand off
these operations to the Floating Point chip?

    % cumulative     self               self    total
 time    seconds  seconds     calls  ms/call  ms/call  name
 13.9     106.87    36.19                              divloop [4]
 13.8     142.71    35.84                              divloop [5]
  3.3     162.28     8.69                              divide [10]
  3.3     170.84     8.56                              not_really_big [11]
  3.2     179.13     8.29                              divide [12]
  3.1     187.27     8.14                              not_really_big [13]
  3.0     203.11     7.71                              end_regular_divide [15]
  2.9     210.67     7.56                              end_regular_divide [16]
  2.5     223.95     6.50   9326374     0.00     0.00  .rem [18]
  2.3     229.85     5.91   9326374     0.00     0.00  .div [20]
  1.6     239.22     4.27                              got_result [23]
  1.4     242.88     3.66                              got_result [24]
  0.6     248.88     1.69                              do_regular_divide [25]
  0.6     250.43     1.55                              do_regular_divide [26]
  0.5     251.65     1.22                              end_single_divloop [27]
  0.5     254.04     1.19                              end_single_divloop [29]
  0.2     256.09     0.62         4   155.02   155.02  .urem [33]
  0.1     257.94     0.38                              do_single_div [36]
  0.1     258.32     0.38                              do_single_div [37]
  0.1     259.03     0.36         5    71.01    71.01  .udiv [39]
  0.1     259.38     0.35                              not_too_big [40]
  0.1     259.64     0.27                              not_too_big [41]
  0.1     260.28     0.17                              single_divloop [45]
  0.0     260.35     0.07                              single_divloop [48]
  0.0     260.51     0.01                              zero_divide [55]
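
For readers wondering why a software divide shows up so heavily in a
profile like this one, here is a minimal sketch in C of two generic
techniques. The first is a textbook shift-and-subtract (restoring) divide
that produces one quotient bit per loop trip; it is NOT Sun's .div/.udiv
code, whose internals are not shown in the post. The second routes a signed
32-bit divide through double precision, in the spirit of the floating-point
question above. The names soft_udiv, soft_urem and fp_div are hypothetical,
chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Restoring (shift-and-subtract) unsigned division. One quotient bit is
 * produced per iteration, so a 32-bit divide takes 32 trips around the
 * loop. Hypothetical helper for illustration, not Sun's library routine. */
uint32_t soft_udiv(uint32_t dividend, uint32_t divisor, uint32_t *remainder)
{
    uint32_t quotient = 0;
    uint64_t rem = 0;   /* wide enough that the shift below cannot overflow */
    int bit;

    assert(divisor != 0);   /* the caller must rule out division by zero */

    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit, most significant first. */
        rem = (rem << 1) | ((dividend >> bit) & 1u);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint32_t)1 << bit;
        }
    }
    if (remainder != 0)
        *remainder = (uint32_t)rem;
    return quotient;
}

/* Remainder computed by the same loop (hypothetical helper). */
uint32_t soft_urem(uint32_t dividend, uint32_t divisor)
{
    uint32_t r;
    (void)soft_udiv(dividend, divisor, &r);
    return r;
}

/* Signed division routed through double precision. Assumes IEEE doubles
 * with a 53-bit significand: every 32-bit integer is then exactly
 * representable, and truncating the rounded quotient matches C integer
 * division. Assumes b != 0 and excludes INT32_MIN / -1, which overflows. */
int32_t fp_div(int32_t a, int32_t b)
{
    assert(b != 0);
    assert(!(a == INT32_MIN && b == -1));
    return (int32_t)((double)a / (double)b);
}
```

For example, soft_udiv(100, 7, &r) returns 14 and leaves 2 in r, and
fp_div(-7, 2) returns -3. Whether the floating-point route would actually
beat the library routine on a given machine depends on the cost of the
integer/float conversions and of the FPU divide, so it would have to be
measured rather than assumed.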