Path: utzoo!attcan!lsuc!maccs!cs4g6ag
From: cs4g6ag@maccs.dcss.mcmaster.ca (Stephen M. Dunn)
Newsgroups: comp.sys.ibm.pc.programmer
Subject: Re: Help needed
Summary: some suggestions for speeding up a program
Keywords: optimization
Message-ID: <2622B884.144@maccs.dcss.mcmaster.ca>
Date: 11 Apr 90 04:54:28 GMT
References: <866@dukempd.phy.duke.edu>
Organization: McMaster University, Hamilton, Ontario
Lines: 63


   I'm not familiar with Quick C, but I don't believe it does much
in the way of optimization.  If that's correct, you may be able to
speed things up a bit by using MSC or another optimizing compiler.

Now on to the program itself:

In article <866@dukempd.phy.duke.edu> fang@dukempd.phy.duke.edu (Fang Zhong) writes:

$		for(k = 0; k < row; ++k) {
$			for(j = 0; j < col; ++j) {
$				RDPXL(&j, &k, &pval);
$				ival[k+1][j+1] += pval;
$			}
$		}

   The k+1 and j+1 in the final statement above are time-wasters.
The k+1 only needs to be performed when k is incremented, and the
j+1 could be changed so that it actually updates j, since the next
thing that is done is the ++j in the for statement:

	for (k = 0; k < row; ++k)
	{
		register int	temp = k + 1;	/* computed once per row */

		for (j = 0; j < col;)		/* j is incremented in the body */
		{
			RDPXL (&j, &k, &pval);
			ival [temp] [++j] += pval;	/* index is the old j + 1 */
		}
	}

   Any optimizing compiler should be able to hoist the loop-invariant
k+1 out of the inner loop.  I don't know how many will change the
code involving j, though.
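
   The same idea can be pushed a little further: the address of row
k+1 of ival is also invariant in the inner loop, so it can be computed
once per row and walked with a pointer.  The following is only a
sketch.  The array sizes MAXROW and MAXCOL, the element type long,
the function name accumulate and the body of RDPXL are all assumptions
made so that the fragment compiles; the real RDPXL comes with the
board.

```c
#include <assert.h>

#define MAXROW 4	/* hypothetical image size, illustration only */
#define MAXCOL 6

/* Hypothetical stand-in for the board's RDPXL.  The real routine reads
   the pixel at column *x, row *y; this stub just makes up a value. */
void RDPXL (int *x, int *y, int *pval)
{
	*pval = *y * 10 + *x;
}

long ival [MAXROW + 1] [MAXCOL + 1];	/* element type is an assumption */

void accumulate (int row, int col)
{
	int	j, k, pval;

	assert (row <= MAXROW && col <= MAXCOL);

	for (k = 0; k < row; ++k)
	{
		/* address of ival[k+1][1], computed once per row */
		long	*p = &ival [k + 1] [1];

		for (j = 0; j < col; ++j)
		{
			RDPXL (&j, &k, &pval);
			*p++ += pval;	/* no index arithmetic left here */
		}
	}
}
```

   Whether this beats the indexed version depends on the compiler and
memory model, so it is worth timing both.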

   In any case, though, the biggest time-waster in the code you have
is the overhead involved in calling RDPXL (row * col) times.  Each
time, the arguments have to be pushed onto the stack before the call
and popped afterwards.  If these are calls to subroutines provided
with the board, you have one choice, maybe two:

- write an assembly language routine that you call once for each
  column that does the calls itself and returns a whole column at
  once

- see if there's such a call provided for you

   If you wrote RDPXL yourself, you should seriously consider writing
a new routine that returns a whole column or row at a time.
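
   As a rough sketch of the row-at-a-time idea, assuming you can get
at the pixel data yourself: RDROW below is a hypothetical routine (it
is not part of any board library I know of), and the frame array is
only a simulated stand-in for the board's memory.  It fetches one row
per call because the inner loop in the posted code runs along j for a
fixed k.  The sizes MAXROW and MAXCOL and the type of ival are
assumptions as well.

```c
#include <assert.h>

#define MAXROW 4	/* hypothetical image size, illustration only */
#define MAXCOL 6

/* Simulated frame memory standing in for the board. */
int frame [MAXROW] [MAXCOL];

/* Hypothetical row reader: copies the first n pixels of row y to buf. */
void RDROW (int y, int n, int *buf)
{
	int	x;

	for (x = 0; x < n; ++x)
		buf [x] = frame [y] [x];
}

long ival [MAXROW + 1] [MAXCOL + 1];	/* element type is an assumption */

void accumulate_rows (int row, int col)
{
	int	buf [MAXCOL];
	int	j, k;

	assert (row <= MAXROW && col <= MAXCOL);

	for (k = 0; k < row; ++k)
	{
		long	*p = &ival [k + 1] [1];

		RDROW (k, col, buf);	/* one call per row, not per pixel */

		for (j = 0; j < col; ++j)
			p [j] += buf [j];
	}
}
```

   This makes row calls instead of row * col, at the cost of a buffer
one row long.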

   So the first suggestion basically involves optimizing the code
yourself instead of hoping the compiler will do so (you'd be
surprised at how much faster a program can run if you remove stuff
from loops that only needs to be done once!), and the second
involves trying to cut down the overhead of function calls.
These are both techniques that are generally applicable, especially
the first one when using non-optimizing compilers.
-- 
               More half-baked ideas from the oven of:
****************************************************************************
Stephen M. Dunn                               cs4g6ag@maccs.dcss.mcmaster.ca
     <std_disclaimer.h> = "\nI'm only an undergraduate ... for now!\n";