Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!uwm.edu!spool.mu.edu!munnari.oz.au!metro!cluster!rex
From: rex@cs.su.oz (Rex Di Bona)
Newsgroups: comp.arch
Subject: Re: Bitfield instructions--a good idea?
Keywords: Graphics, Rendering
Message-ID: <2325@cluster.cs.su.oz.au>
Date: 22 Apr 91 06:25:48 GMT
References: <1991Apr15.193425.3436@waikato.ac.nz> <10408@labtam.labtam.oz>
Sender: news@cluster.cs.su.oz.au
Reply-To: rex@cluster.cs.su.oz (Rex Di Bona)
Organization: Basser Dept of Computer Science, University of Sydney, Australia
Lines: 46

In article <10408@labtam.labtam.oz> graeme@labtam.labtam.oz (Graeme Gill) writes:
> Consider that many applications are written to run on monochrome
> devices, so they use single level bitmaps and fonts to encode graphic
> objects. When rendered on a colour device, the 1->8 or 1->24 bit
> expansion is used a great deal.

This is true, but misleading. You do not want to convert 0 -> 00000000
and 1 -> 11111111, but 0 -> some colour, and 1 -> some other colour.
Both of these colours are user selectable. This requires more hardware
support than just a bit stretcher.

> > when it does happen, you can use look up tables. to do your 1 -> 2
> > bits per pixel example, i would use a table of 256 entries, 16 bits
> > each.
>
> In fast hardware the memory bandwidth is the limit. Using lookup tables
> doubles the number of memory accesses, thereby halving the ultimate speed.
> Hardware support for 1->power_of_two expansion is a desirable feature
> in a graphics processor.
>
>
> Graeme Gill
> Labtam Australia

You should be able to do all of the rendering in software, keeping the
intermediate values in registers. This requires a better layout of the
memory for the graphics 'screen'. You could use a lookup table for
additional speed. As an example, if your video memory were 8 bits deep,
and laid out so that 4 pixels occupied a 32 bit word, you could
construct a 16 element table (each entry 32 bits wide) with the
appropriate pixel positions filled in. Setting up the table would
require 16 memory stores, but your rendering would be twice as fast as
the byte at a time method (one read and one write as opposed to 4
writes). You could remove the read by having all 16 values stored in
registers and branching to the appropriate store, or by other hardware
nasties.

The one instruction I keep wishing for when doing these sorts of graphic
operations is a rotate instruction. It can be used for line drawing, or
as a counter for the above routine (it wouldn't require reloading for
each loop, however), and it does save instructions...

I have found that normal shifts are fine for bitblts, even those on
random boundaries and for random widths. I would suspect that the
additional overhead of determining when to use the 88K type bitfield
operations would swamp their usefulness.
--------
Rex di Bona (rex@cs.su.oz.au)
Penguin Lust is NOT immoral
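
For concreteness, the 16 element table described above might be
sketched in C roughly as follows. This is only an illustration under
stated assumptions, not code from the post: the names
build_expand_table and expand_row are hypothetical, the leftmost pixel
is assumed to sit in the most significant byte of each 32 bit word, and
the source bitmap is assumed to keep its leftmost pixel in the most
significant bit of each byte. A real frame buffer may order things
differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill the 16 entry table. Entry n holds four 8 bit pixels, one per bit
 * of the 4 bit pattern n: a set bit becomes fg, a clear bit becomes bg.
 * Assumption: bit 3 of n is the leftmost pixel and lands in the most
 * significant byte of the word. */
static void build_expand_table(uint32_t table[16], uint8_t bg, uint8_t fg)
{
    for (unsigned n = 0; n < 16; n++) {
        uint32_t word = 0;
        for (unsigned bit = 0; bit < 4; bit++) {
            uint8_t colour = (n & (8u >> bit)) ? fg : bg;
            word = (word << 8) | colour;
        }
        table[n] = word;   /* one store per entry, 16 stores in total */
    }
}

/* Expand nbytes of 1 bit per pixel source into 2 * nbytes words of
 * 8 bit pixels. Each 4 pixel group costs one table read and one word
 * write, rather than four separate byte writes. */
static void expand_row(uint32_t *dst, const uint8_t *src, size_t nbytes,
                       const uint32_t table[16])
{
    assert(dst != NULL && src != NULL && table != NULL);
    for (size_t i = 0; i < nbytes; i++) {
        uint8_t b = src[i];
        *dst++ = table[b >> 4];     /* left four pixels of this byte  */
        *dst++ = table[b & 0x0Fu];  /* right four pixels of this byte */
    }
}
```

The table only needs rebuilding when the foreground or background
colour changes, so its 16 stores are spread over however many rows are
drawn with that colour pair.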