Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!wuarchive!zaphod.mps.ohio-state.edu!rpi!uupsi!sunic!cs.umu.se!dvljrt
From: dvljrt@cs.umu.se (Joakim Rosqvist)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: 3D stuff & quick line drawing
Message-ID: <1991Apr29.110534.10198@cs.umu.se>
Date: 29 Apr 91 11:05:34 GMT
References: <1991Apr22.122442.25505@cs.umu.se> <20884@cbmvax.commodore.com> <00672624939@elgamy.RAIDERNET.COM>
Sender: news@cs.umu.se (News Administrator)
Organization: Dep. of Info.Proc, Umea Univ., Sweden
Lines: 61

>WaitBlit() isn't *THAT* bad!

The instructions jsr waitblit(a6) and the rts at the end of that routine
take 6 us to execute all by themselves, and I had 23 us setup.
It would at least double my setup time to use waitblit.
(I'll disassemble it and see what it really does.)

>> With the blitter I got 738000 pixels/second and 23 microsecond startup-time
>> all take the same time. This is because the setup-time was really 58 micros
>> but for longer lines some of this could be done while the blitter was
>> drawing the previous line.
>> With the 68000 I could get about 150000 pixels/sec and 24 micros 'setup'
>
>Note that using the blitter to draw lines into a four bitplane screen takes
>four times longer than into one bitplane. Four times the setup. Four times
>the drawing. Meanwhile, extending your one-bitplane CPU-driven whatzit to
>four bitplanes basically consists of putting in three more BSET
>instructions into your loop. Which does NOT increase the overhead by four
>times, and affects setup time very little (basically have to MOVEM four
>bitplane pointers into address registers, rather than a single bitplane
>pointer). You might end up with, say, 170000 pixels/second + 92 microsecond
>setup time for the blitter version, and 130000 pixels/sec with 25 us setup
>for the CPU version. The tradeoffs, for short lines, then become much more
>obvious, though the blitter is still fastest for long lines on
>unaccelerated Amigas.
>(on accelerated ones, no way!).
>
>I *DO* assume that you're using assembly language for all of this?
>

Is there anything else :-)

Interesting point about 4-bitplane lines. For the blitter I get
738000/4 = 185000 pix/sec.
The CPU will slow down much more than you say. Ironically enough,
the reason is that the routine is so optimized that the BSET is actually
quite a large part of the inner loop. Right now I have about 45 cycles
per pixel (at 292 degrees). It would increase to about 80 in 4 bitplanes,
giving 86000 pix/sec. Still big advantages to using the blitter.

Here is my inner loop:

        BSET    d7,(a0)   ;Plot pixel.                        12 cycles
        ADD     d6,a0     ;Go to the next scanline.            8 cycles
        SUB     d3,d5     ;This routine always goes one pixel down and sometimes
                          ;(every second time for 292 deg) one pixel right.
                          ;d3 is MIN(dx,dy), that is dx in this case.  4 cycles
        BPL.S   over      ;See if it is time to go right.      9 cycles
                          ;(8 or 10 depending on whether it branches)
        ADD     d4,d5     ;d4 is MAX(dx,dy)=dy in this case.   4/2=2 cycles
        SUBQ    #1,d7     ;One pixel right                     4/2=2 cycles
        BPL.S   over      ;If 0