Path: utzoo!attcan!uunet!snorkelwacker!bloom-beacon!EXPO.LCS.MIT.EDU!keith
From: keith@EXPO.LCS.MIT.EDU (Keith Packard)
Newsgroups: comp.windows.x
Subject: Re: XCopyArea pixmap to pixmap: why is it so slow ?
Message-ID: <8912022028.AA01518@xenon.lcs.mit.edu>
Date: 2 Dec 89 20:28:16 GMT
References: <8912021948.AA02569@expo.lcs.mit.edu>
Sender: root@athena.mit.edu (Wizard A. Root)
Organization: The Internet
Lines: 42

> I have done some performance testing of XCopyArea and I have found that on
> most of the servers that we use (Vax GPX, hp, IBM Ps2) that it is around
> 10 times slower to do a XCopyArea from pixmap to pixmap than to do XCopyArea
> from window to window.

In each of the given examples, the display memory is controlled by
special "graphics decelerators", while off-screen memory is controlled
using the CPU alone. Because most benchmarking programs only measure
performance for on-screen graphics, vendors typically optimize the code
and hardware that draw there, and leave the off-screen rendering to some
first-year engineer. This causes a large discrepancy between on-screen
and off-screen graphics performance, which can easily be rectified by
writing more intelligent off-screen graphics code.

> So far I found that only the
> DecStation (mips) has a copyarea function that is as fast on pixmaps as
> on windows.

This is because on-screen rendering is the same as off-screen rendering
there; the CPU does all of the work in both cases and simply writes to
either main memory or the memory-mapped frame buffer. This means that
tuning on-screen performance effectively tunes the off-screen cases as
well.

A more substantial advantage of this latter method is the performance
you will get in moving bits between the screen and off-screen pixmaps.
The performance you see for either on-screen/on-screen or
off-screen/off-screen bitblt will be the same as on-screen/off-screen.
And all with only one copy of the bitblt code.

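To make the "one copy of the bitblt code" point concrete, here is a minimal C sketch. It is not taken from any X server. The `Surface` type and the `copy_area` function are hypothetical names invented for illustration, and the sketch assumes 8-bit pixels and a caller that has already clipped the rectangle to both surfaces. The only thing that tells a window from a pixmap is where `base` points: the memory-mapped frame buffer or ordinary main memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical drawable descriptor (not an X server structure).
 * base may point at a memory-mapped frame buffer or at main memory. */
typedef struct {
    uint8_t *base;   /* first byte of pixel storage */
    size_t   stride; /* bytes per scanline */
    int      width;  /* width in pixels (assumed 8 bits per pixel) */
    int      height; /* height in pixels */
} Surface;

/* Copy a w x h rectangle from (sx, sy) in src to (dx, dy) in dst.
 * The same routine serves screen-to-screen, pixmap-to-pixmap and
 * screen-to-pixmap copies. Assumes the rectangle is already clipped. */
static void copy_area(const Surface *src, Surface *dst,
                      int sx, int sy, int w, int h, int dx, int dy)
{
    if (w <= 0 || h <= 0)
        return;

    assert(sx >= 0 && sy >= 0 && sx + w <= src->width && sy + h <= src->height);
    assert(dx >= 0 && dy >= 0 && dx + w <= dst->width && dy + h <= dst->height);

    if (src->base == dst->base && dy > sy) {
        /* Same storage, moving down: copy bottom-up so that source
         * rows are read before they are overwritten. */
        for (int y = h - 1; y >= 0; y--)
            memmove(dst->base + (size_t)(dy + y) * dst->stride + (size_t)dx,
                    src->base + (size_t)(sy + y) * src->stride + (size_t)sx,
                    (size_t)w);
    } else {
        /* Different storage, or moving up or sideways: top-down is safe.
         * memmove handles any overlap within a single row. */
        for (int y = 0; y < h; y++)
            memmove(dst->base + (size_t)(dy + y) * dst->stride + (size_t)dx,
                    src->base + (size_t)(sy + y) * src->stride + (size_t)sx,
                    (size_t)w);
    }
}
```

With this shape, any tuning done to the inner row copy benefits every source and destination combination at once, which is the effect described above. A real server would also have to deal with other depths, plane masks, raster ops and clipping, all of which this sketch leaves out.
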
The disadvantage of this approach is that special graphics hardware is
frequently connected to a display memory system that provides additional
bandwidth (interleaved memory, page-mode access, or wider-than-32-bit
access). This special memory system allows the on-screen/on-screen case
to work faster than would otherwise be possible. Note that the magic
graphics hardware itself has little effect on this: the typical CPU can
easily copy bits around as fast as the memory system can take them, but
it usually lacks a fast enough path to the display memory.

Keith Packard
MIT X Consortium