Path: utzoo!utgpu!attcan!uunet!cbmvax!jesup
From: jesup@cbmvax.UUCP (Randell Jesup)
Newsgroups: comp.arch
Subject: Re: Sw vs. Hw BitBlit.
Keywords: BitBlit.
Message-ID: <4366@cbmvax.UUCP>
Date: 29 Jul 88 02:10:14 GMT
References: <399@ma.diab.se> <1313@ucsfcca.ucsf.edu> <61783@sun.uucp>
Reply-To: jesup@cbmvax.UUCP (Randell Jesup)
Organization: Commodore Technology, West Chester, PA
Lines: 89

In article <61783@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
> For applications in terminals, there are three cases of "bitblt" that
> dominate: drawing characters, scrolling windows and window-window
> operations such as exchanging off-screen data with the display. These
> cases also cover the most common graphics operations on personal
> computers.

They were dealing with straight terminals and simple rectangular
windows, used mainly as a character-oriented interface to larger
machines. That is a very different environment from today's
microcomputers, such as the Amiga, Mac, etc.

> decide how to draw the image. Because the characters are so small --
> drawing the letter 'a' touches 7 words of memory -- actually changing
> the pixels in the destination bitmap is relatively unimportant. Our

Characters are a somewhat special case, and are well worth
special-casing in the code. The blitter does help a lot with
proportional kerned fonts, less with monospaced fonts, and not at all
with monospaced byte-multiple wide fonts aligned on byte boundaries.
Unfortunately, this last case doesn't happen often, especially in a
windowing environment. When it does, it can make editors that cover the
screen several times faster.

> The second common case of "bitblt" is scrolling a rectangular region
> of a bitmap, usually the display. Since the word boundaries in the
> scan lines of a bitmap are at the same place in each line, the speed of
> scrolling depends primarily on the speed of the MC68000 instruction

Once again, this is true in a text-based environment.
In a WIMP environment, this is much less true. Block operations usually
start on arbitrary boundaries, and tend to be inconvenient widths.

>     register long *p, *q;
>     *p++ = *q++;
>
> For typical rectangles, the edges, which must be handled with more
> complicated code, do not dominate the performance. There is nothing
> hardware can do to accelerate this loop except provide faster memory
> access. If the display were accessed through a narrower or clumsier
> interface, it would take longer to move the data.

This is nowhere near as fast as the memory system can go nowadays,
even given the slowest/cheapest DRAMs. For that loop, even unrolled,
the CPU spends at least 33% of its time on instruction fetch, and even
so the CPU only uses every other memory cycle.

> The last common case is shuffling on- and off-screen rectangles. It
> can be made fast by a simple observation: the off-screen bitmaps are
> allocated by "balloc", which is given as argument the rectangle on the
> display occupied by the data. This rectangle is assigned to "rect" in
> the resulting "Bitmap". "balloc" can therefor allocate the bitmap so
> that the word boundaries occur in the same places in the image as they
> do in the display, reducing to the scrolling case the "bitblt" call
> that copies the data.

This is nowhere near the common case on machines like the Amiga.

>I tend to believe Rob Pike and company when they say that "for real _bit_blit_
>operations such as moving a block 37 bits wide aligned starting at bit 17 in
>the source position and starting at bit 29 in the destination position on a
>machine with 32-bit registers and data paths" are not typical (at least in the
>way they used Blits) except for character painting, where overhead above and
>beyond the bit-pushing dominates. If you have evidence to indicate that this
>is not the case, let's see it.

You've said the operative clause: the way they used Blits.
As I've said, blitter hardware can also buy you linedraw and areafill
relatively cheaply. These things are MUCH faster as part of the blitter
than when done by the CPU, up to 20x for linedraw.

>If a BitBlt chip is reasonably cheap, and can do the whole job, it may be worth
>it. Note that in the cases shown, you got at most a 3.5x speedup (scroll
>screen horizontally). For vertical scrolling, you got only 1.18x; for randomly
>drawing the letter 'a', you got only 1.23x; and for texturing a random 40x40
>square, you got 1.95x. How cheap does it have to be for that to be worth it?

You get bigger wins in animation or multitasking environments. A
blitter is relatively cheap if you already need video chips (of course,
Commodore has chip design facilities, and uses custom chips for most
things). The blitter on the Amiga is just a part of one of the graphics
chips, maybe 1/4 of it.

A factor of 2-4x can make a really amazing difference in perceived
speed, especially if update operations go down to one frame time. Using
my Sun-2 (no blitter, no color) is positively painful compared to my
Amiga, even though the Amiga is running in 4 colors (in this case).

-- 
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup