Xref: utzoo comp.graphics:4116 comp.windows.x:7296
Path: utzoo!attcan!uunet!lll-winken!ames!ucsd!orion.cf.uci.edu!uci-ics!venera.isi.edu!raveling
From: raveling@vaxb.isi.edu (Paul Raveling)
Newsgroups: comp.graphics,comp.windows.x
Subject: Re: Luminance from RGB
Message-ID: <7266@venera.isi.edu>
Date: 14 Jan 89 03:37:39 GMT
References: <572@midgard.Midgard.MN.ORG> <10322@well.UUCP>
Sender: news@venera.isi.edu
Reply-To: raveling@vaxb.isi.edu (Paul Raveling)
Organization: USC-Information Sciences Institute
Lines: 53

In article <10322@well.UUCP> Jef Poskanzer writes:
>I wrote a quick test program to try out various approximations.  It runs
>five million conversions.  On a Sun 3/260, the timings are:
>
>    float:  223.0
>    int:     35.4
>    table:   31.6
>
>I have appended the program, in case anyone wants to run it on a different
>architecture or try different approximations.

Just below are some results from an HP 9000/350.  I added two runs:
One was with "shifty" logic defined by...

    #ifdef SHIFTY
        j = ( r+r + (g<<2)+g + b ) >> 3;
    #endif

The other was a "no logic" run, with nothing defined to get
an overhead calibration (how much time the loop logic and rgb
updating used).  The "Less Overhead" column below subtracts this
to get a direct comparison of timing for the math only.

    Test        Raw Timing    Less Overhead
    ----        ----------    -------------
    float          220.0          201.2
    int             37.6           18.8
    table           34.9           16.1
    shifty          28.9           10.1
    overhead        18.8            0

This isn't entirely what I anticipated:

The "int" version, (j = ( r * 77 + g * 150 + b * 29 ) >> 8;),
appeared to be faster than expected.  I checked further and found
that on this one the compiler decomposed all three multiplies into
shifts, adds, and a subtract.

Also, the table version seemed too slow.  It turned out that
the compiler generated some remarkably crummy code.  ALL data
except the tables were kept on the stack -- none in registers --
and the subscript address computations appeared to be distinctly
suboptimal.

Next, maybe tomorrow, I'll try the same stuff with some hand coded
assembly language.  It should be easy to beat the compiler by LOTS.

---------------------
Paul Raveling
Raveling@vaxb.isi.edu