Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!csn!ncar!gatech!usenet.ins.cwru.edu!agate!stanford.edu!rutgers!modus!gear!cadlab!martelli
From: martelli@cadlab.sublink.ORG (Alex Martelli)
Newsgroups: comp.arch
Subject: Re: RISC vs. CISC -- SPECmarks
Message-ID: <820@cadlab.sublink.ORG>
Date: 5 May 91 10:08:56 GMT
References: <3423@charon.cwi.nl> <11602@mentor.cc.purdue.edu> <1991Apr30.163153.18568@midway.uchicago.edu> <1991May2.162909.9165@news.arc.nasa.gov>
Organization: CAD.LAB, Bologna, Italia
Lines: 18

mccalpin@perelandra.cms.udel.edu (John D. McCalpin) writes:
	...
:theory, but all too often I find that the various machine's
:optimizers require *slightly* different code --- there is no one piece
:of code (even a nice block-mode version) that optimizes well on a
:broad range of scalar platforms.... Matrix multiply is a good example

Yes, I do agree with that - which speaks well for Dan Bernstein's idea
of having a language construct to tell the compiler: here are 2/3/N
different implementations of the SAME programming semantics, now please
choose the one that's fastest on THIS machine!  This way we would still
have to do the hand-tweaking initially, but once our code performs well
on, say, half a dozen platforms, we stand a far better chance of being
able to just compile and run fast on any new platform... and this holds
not only for numerical codes, but for much bread and butter stuff as
well, e.g. an explicit 'strcpy(a,b);' versus 'while(*a++=*b++);' where
some machines and compilers might be able to inline the call, and
others might not, just to give a trivial example.
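
To make the strcpy example concrete, here is a rough sketch in C of "several implementations of the same semantics, pick the fastest here". Note the assumptions: the idea described above is a language construct resolved by the compiler, whereas this sketch only approximates it at run time by timing each candidate with clock(). All the names (copy_fn, copy_library, copy_loop, time_copy, choose_fastest) are hypothetical, invented for illustration, and the sample string and repetition count are arbitrary.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical type: any function with strcpy's semantics. */
typedef char *(*copy_fn)(char *dst, const char *src);

/* Candidate 1: the explicit library call, 'strcpy(a,b);'. */
char *copy_library(char *dst, const char *src)
{
    return strcpy(dst, src);
}

/* Candidate 2: the hand-written loop, 'while(*a++=*b++);'. */
char *copy_loop(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}

/* CPU time spent on `reps` calls of one candidate.  A crude measure:
   an optimizing compiler may hoist or drop part of this loop. */
clock_t time_copy(copy_fn f, char *dst, const char *src, int reps)
{
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        f(dst, src);
    return clock() - start;
}

/* Return whichever candidate ran fastest on THIS machine.
   Ties keep the earlier candidate.  Requires n >= 1. */
copy_fn choose_fastest(const copy_fn *candidates, size_t n)
{
    static const char sample[] =
        "the quick brown fox jumps over the lazy dog";
    char scratch[sizeof sample];
    const int reps = 100000;        /* arbitrary repetition count */

    copy_fn best = candidates[0];
    clock_t best_time = time_copy(best, scratch, sample, reps);

    for (size_t i = 1; i < n; i++) {
        clock_t t = time_copy(candidates[i], scratch, sample, reps);
        if (t < best_time) {
            best_time = t;
            best = candidates[i];
        }
    }
    return best;
}
```

A caller would fill an array with copy_library and copy_loop, call choose_fastest once at startup, and use the returned pointer from then on. The indirect call itself defeats inlining, which is part of why a compile-time construct would be preferable to this run-time stand-in.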