Path: utzoo!attcan!uunet!lll-winken!csd4.milw.wisc.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!BRL.MIL!mike
From: mike@BRL.MIL (Mike Muuss)
Newsgroups: comp.sys.sgi
Subject: Re: lrectwrite & gsync?
Message-ID: <8903032345.aa14962@SEM.BRL.MIL>
Date: 4 Mar 89 04:45:03 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 96

Mark -

Thanks for your detailed and informative note.

From what you say, the 60% SYS time must be DMA setup, and the 40% IDLE
time must be the actual DMA transmission time. I was seeing 1000
scanlines/second. If that translates to 1000 syscalls and interrupts
per second, then I can understand the significant overhead that I was
encountering.

I would like the opportunity to vary the pipe_write / DMA crossover
point in my application, to see if I can produce faster screen updates.
The evaluation SGI used to set that threshold may not have taken the
system overhead fully into account.

THE BIG PICTURE

Let me also take this opportunity to tell you what I need to do;
perhaps you can suggest a different strategy that would achieve
higher performance.

I have a shared memory segment that is organized as 1024 scanlines of
1280 pixels of 4 bytes each (SGI AlphaBGR format for lrectwrite).
The arrangement of this data must be fixed, regardless of what
sub-rectangle of it is presently of interest. If it would help any, I
can change the internal organization any way I like, subject to the
previous constraint.

When the application is using the full screen, this entire array is
written with a single call to lrectwrite(), with delightfully good
performance.

When the application is using a smaller window, it presently drops back
to a loop which calls lrectwrite() once per scanline.
Here is the actual code fragment:

    /* Simplest case, nothing fancy */
    y = ybase;
    if( !sw_zoom && !sw_cmap )  {
        if( ifp->if_width == SGI(ifp)->mi_memwidth )  {
            /* This one is very fast */
            lrectwrite(
                SGI(ifp)->mi_xoff+0,
                SGI(ifp)->mi_yoff+y,
                SGI(ifp)->mi_xoff+0+ifp->if_width-1,
                SGI(ifp)->mi_yoff+y+nlines-1,
                &ifp->if_mem[(y*SGI(ifp)->mi_memwidth)*
                    sizeof(struct sgi_pixel)] );
            return;
        }
        for( n=nlines; n>0; n--, y++ )  {
            lrectwrite(
                SGI(ifp)->mi_xoff+0,
                SGI(ifp)->mi_yoff+y,
                SGI(ifp)->mi_xoff+0+ifp->if_width-1,
                SGI(ifp)->mi_yoff+y,
                &ifp->if_mem[(y*SGI(ifp)->mi_memwidth)*
                    sizeof(struct sgi_pixel)] );
            /* XXX big performance hit here.
             * GTX is limited to about 1000 lrectwrites/sec,
             * due to some library synchronization mechanism
             * that burns 60% of the CPU in sys-time. ?!?!
             */
        }
        return;
    }

So, what I really want to do is write a RECTANGLE from my buffer to a
RECTANGLE on the screen, more in the style of rectcopy(). Does the 4D
architecture offer me a way of doing this?

I can imagine several possibilities:

1)  A subroutine, perhaps:

        lrectwriterect( x1, y1, x2, y2, pixel_p, mem_width, mem_skip )

    which would use mem_width pixels, then skip mem_skip pixels, and
    repeat. This would be perfect.

2)  A subroutine modeled on the Berkeley writev() call that would take
    an array of structures roughly like this (any reasonable layout is
    fine with me):

        struct fast_pixel_cmds {
            int xscr_base;
            int yscr_base;
            struct sgi_pixel *pixel_p;
            int count;
        } array[MAX_CMDS];

        fast_pixel_write_v( &array[0], cmd_count );

3)  A "vector" version of lrectwrite() that looked something like this:

        struct lrectwrite_vector {
            int xscr_base, yscr_base;
            int xscr_max, yscr_max;
            struct sgi_pixel *pixel_p;
        } array[MAX_CMDS];

        lrectwrite_v( &array[0], cmd_count );

Any suggestion at all that you might have will be greatly appreciated!

    Thanks,
     -Mike
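
To make suggestion 1 concrete, here is a minimal sketch in portable C of the "use mem_width pixels, then skip mem_skip pixels, and repeat" access pattern. It is not part of the SGI graphics library: gather_rows() and struct pixel are hypothetical names introduced only for illustration, and struct pixel merely stands in for the 4-byte struct sgi_pixel above. The routine packs the strided rows into a contiguous scratch buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the 4-byte struct sgi_pixel (AlphaBGR order). */
struct pixel {
    unsigned char a, b, g, r;
};

/*
 * Hypothetical helper, not an SGI library call.
 * Starting at src, copy mem_width pixels, skip mem_skip pixels, and
 * repeat for nrows rows, packing the result contiguously into dst.
 * dst must have room for mem_width * nrows pixels.
 * Returns the number of pixels copied.
 */
size_t gather_rows(struct pixel *dst, const struct pixel *src,
                   size_t mem_width, size_t mem_skip, size_t nrows)
{
    size_t stride = mem_width + mem_skip;  /* pixels from one row start to the next */
    size_t row;

    assert(dst != NULL && src != NULL);

    for (row = 0; row < nrows; row++) {
        memcpy(dst + row * mem_width,
               src + row * stride,
               mem_width * sizeof *dst);
    }
    return mem_width * nrows;
}
```

The packed buffer could then be handed to a single lrectwrite() call covering the whole window. Whether the extra copy costs less than roughly 1000 per-scanline calls per second is an assumption that would have to be measured on the GTX.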