Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!nbires!hao!gaia!zhahai
From: zhahai@gaia.UUCP (Zhahai Stewart)
Newsgroups: comp.sys.ibm.pc
Subject: Re: DMA
Message-ID: <135@gaia.UUCP>
Date: Fri, 14-Nov-86 17:08:40 EST
Article-I.D.: gaia.135
Posted: Fri Nov 14 17:08:40 1986
Date-Received: Sun, 16-Nov-86 01:10:27 EST
References: <1189@dataio.UUCP> <197@oliveb.UUCP>
Organization: Gaia Corp, Boulder, CO
Lines: 62
Summary: Two ways to copy EGA memory

In article <197@oliveb.UUCP>, spud@oliveb.UUCP (John E. Purser) writes:
> In article <1189@dataio.UUCP> bright@dataio.UUCP (Walter Bright) writes:
> >I am interested in copying pixel data from one page to another on the
> >IBM EGA. This involves moving 128k bytes of data. Doing it with a
> >REP MOVSW takes about 1/2 second (on an AT), which is too slow.
> 
> How did you arrive at the time of 1/2 second? The way I figure this
> it should only take about .05 seconds. According to the 286 programmers
> reference guide a REP MOVSW takes 5+(4*CX) clocks. In your example that would
> be 64k words times 4 plus 5 or a total of 262,149 clocks. The clock speed
> of the AT is 6 MHz so dividing the 262,149 by 6,000,000 leaves us with .045
> seconds. It may be that the video RAM is slow and requires a wait state
> or 2 on each access but that's a memory limitation and it won't help to use
> DMA in that case.
> 
First off, if you want to move 128K "pages", I presume that you are using the
EGA in its highest resolution, 640x350 in 16 colors, mode 10 (hex).  In this
mode the video refresh seems to eat up much of the memory bandwidth, so the
EGA inserts wait states as needed until a free "access slot" is available to
service the processor.  This happens even on a 4.77 MHz 8088 in the PC, not
to mention, for example, my 8 MHz 80286.  Because of this, and the very well
optimized string instructions on the 286, I doubt that DMA could be any faster
than CPU based moves when copying EGA->EGA (even if you could get
memory->memory DMA working, that is).

You have two basic possibilities for EGA->EGA moves: plane by plane or all at
once.  For plane by plane, set the EGA to read a given plane (of 4) and to
write only to that same plane, then do the copy (80x350 = 28,000 bytes) using
REP MOVSW with CX = 14000; then switch the read and write enables to the next
plane and repeat.  This will be hampered by the fact that each 16 bit read or
write will actually be done as 2 back to back 8 bit accesses (transparent to
the CPU - the EGA is an 8 bit card), each with several wait states, so this
will be considerably slower than the MOVSW calculation above (which assumed
real 16 bit transfers with 0 wait states).
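
To make the plane by plane method concrete, here is a minimal sketch in C.  It
is a software model of the card, not real hardware access: the EgaModel type,
the read_map and map_mask fields, and the cycle counter are all made up for
illustration.  On a real EGA the plane selection would be done through the
card's read plane select and write plane enable registers as described in the
IBM manuals, and the inner loop would be the REP MOVSW.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PLANES = 4, PLANE_BYTES = 80 * 350 };  /* 28,000 bytes per plane per page */

/* Hypothetical software stand-in for EGA memory; NOT real hardware access. */
typedef struct {
    uint8_t plane[PLANES][2 * PLANE_BYTES];   /* room for two pages per plane */
    unsigned read_map;     /* which plane a CPU read returns */
    unsigned map_mask;     /* bit p set: plane p accepts CPU writes */
    unsigned long cycles;  /* count of 8 bit EGA memory cycles */
} EgaModel;

/* One 8 bit CPU read: returns the byte from the selected read plane. */
static uint8_t ega_read(EgaModel *e, size_t off)
{
    assert(off < (size_t)(2 * PLANE_BYTES));
    e->cycles++;
    return e->plane[e->read_map][off];
}

/* One 8 bit CPU write: stored in every plane enabled in map_mask. */
static void ega_write(EgaModel *e, size_t off, uint8_t value)
{
    assert(off < (size_t)(2 * PLANE_BYTES));
    e->cycles++;
    for (unsigned p = 0; p < PLANES; p++)
        if (e->map_mask & (1u << p))
            e->plane[p][off] = value;
}

/* Copy one page, one plane at a time; returns the memory cycles used. */
unsigned long copy_page_by_planes(EgaModel *e, size_t src, size_t dst)
{
    unsigned long start = e->cycles;
    for (unsigned p = 0; p < PLANES; p++) {
        e->read_map = p;          /* read from plane p ...        */
        e->map_mask = 1u << p;    /* ... and write only to plane p */
        for (size_t i = 0; i < PLANE_BYTES; i++)
            ega_write(e, dst + i, ega_read(e, src + i));
    }
    return e->cycles - start;
}
```

In this model, copying page 0 to page 1 with copy_page_by_planes(&ega, 0,
PLANE_BYTES) costs 4 x 28,000 x 2 = 224,000 eight bit memory cycles, which is
the "4 memory cycles per 16 bits" figure counted over the whole page.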

The other way is to set up the EGA to write from its internal latches,
which hold the 32 bits retrieved by the last EGA read (8 bits x 4 planes).
Then you do a 1 byte read from the source and a 1 byte write to the
destination (the data written does not matter, only the write strobe and
address) in order to transfer 32 bits (1 byte x 4 planes).  You cannot do this
a word at a time because the internal latches are only 8 bits wide per plane.
So in this case you use a REP MOVSB with CX = 28000.  Each repetition
transfers 32 bits with only two memory cycles (and the corresponding wait
states), as opposed to the first method, which transfers 16 bits per
repetition with 4 memory cycles and the corresponding waits.  This should be
much faster; ironically, it should run at approximately the same speed on a
PC or an AT, since the limitation is the EGA cycle stealing and its 8 bit wide
internal path.  Did that come across?  The second technique should work
faster on a PC than the first does on an AT.
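
The latch method can be sketched the same way.  Again this is only a software
model in C with made-up names (EgaLatchModel, latch_read, latch_write); it
assumes the card has already been put in the mode where a CPU write stores the
latch contents (write mode 1 in the IBM documentation, if I have it right),
and it shows why the byte the CPU writes is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NPLANES = 4, PAGE_BYTES = 80 * 350 };  /* 28,000 addresses per page */

/* Hypothetical software stand-in for EGA memory plus its four latches. */
typedef struct {
    uint8_t plane[NPLANES][2 * PAGE_BYTES];   /* room for two pages per plane */
    uint8_t latch[NPLANES];                   /* one 8 bit latch per plane */
    unsigned long cycles;                     /* count of EGA memory cycles */
} EgaLatchModel;

/* One CPU byte read: loads all four latches from the same address. */
static uint8_t latch_read(EgaLatchModel *e, size_t off)
{
    assert(off < (size_t)(2 * PAGE_BYTES));
    e->cycles++;
    for (unsigned p = 0; p < NPLANES; p++)
        e->latch[p] = e->plane[p][off];
    return e->latch[0];   /* what the CPU sees does not matter here */
}

/* One CPU byte write: every plane gets its own latch, CPU data is ignored. */
static void latch_write(EgaLatchModel *e, size_t off, uint8_t cpu_data)
{
    assert(off < (size_t)(2 * PAGE_BYTES));
    (void)cpu_data;       /* only the write strobe and address count */
    e->cycles++;
    for (unsigned p = 0; p < NPLANES; p++)
        e->plane[p][off] = e->latch[p];
}

/* Copy one page, all four planes at once; returns the memory cycles used. */
unsigned long copy_page_via_latches(EgaLatchModel *e, size_t src, size_t dst)
{
    unsigned long start = e->cycles;
    for (size_t i = 0; i < PAGE_BYTES; i++)   /* stands in for REP MOVSB */
        latch_write(e, dst + i, latch_read(e, src + i));
    return e->cycles - start;
}
```

Here a page copy costs 28,000 x 2 = 56,000 memory cycles for all four planes,
a quarter of what the plane by plane copy needs for the same data.  The model
counts cycles only; it says nothing about how many wait states each one takes.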

Also note that a full screen image in this mode only occupies 112,000 bytes
(4 planes x 28,000), not 128K - so you can save another 15% or so if you only
need to move the visible image.
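
A quick check of the size arithmetic, as a small C sketch.  The function names
are invented for this example; the geometry (640x350, 1 bit per pixel per
plane, 4 planes) is the mode 10 (hex) layout discussed above.

```c
#include <assert.h>

enum { BYTES_PER_ROW = 640 / 8, ROWS = 350, BIT_PLANES = 4 };

/* Bytes actually displayed: 80 bytes/row x 350 rows x 4 planes. */
long visible_image_bytes(void)
{
    return (long)BYTES_PER_ROW * ROWS * BIT_PLANES;
}

/* Percentage saved by copying only the visible image instead of
   full_copy_bytes (for example 128K = 131,072 bytes). */
double percent_saved(long full_copy_bytes)
{
    assert(full_copy_bytes > 0);
    return 100.0 * (double)(full_copy_bytes - visible_image_bytes())
                 / (double)full_copy_bytes;
}
```

By this count visible_image_bytes() is 112,000, and percent_saved(131072L)
comes out at roughly 14.5.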

The exact ways to set up the EGA registers for this can be found in the
IBM manuals, or PC Tech Journal had an article, etc.  Too much to go into
here and now.  Good luck.


-- 
Zhahai Stewart
{hao | nbires}!gaia!zhahai