Path: utzoo!utgpu!cunews!bnrgate!brtph3!brchh104!brchs1!bnr.ca!rice.edu!sun-spots-request
From: oj@saber.com
Newsgroups: comp.sys.sun
Subject: Re: Accurate time measurements on SS1+, 3/260, etc.
Keywords: Miscellaneous
Message-ID: <999@brchh104.bnr.ca>
Date: 1 Jan 91 00:33:11 GMT
Sender: news@brchh104.bnr.ca
Organization: Sun-Spots
Lines: 72
Approved: Sun-Spots@rice.edu
X-Refs:  Original: v9n407
X-Sun-Spots-Digest: Volume 9, Issue 415, message 13
X-Note: Submissions: sun-spots@rice.edu, Admin: sun-spots-request@rice.edu

Mark wrote that he's trying to time function calls using code like the
following, but he can't get enough resolution out of the system clock.

       ...
       get_current_time(&start);
       function_to_be_timed();
       get_current_time(&stop);
       function_time = stop - start;
       ...

I've done this sort of measurement several times, and I've always used a
method like this:

       get_current_time(&start);
       for (i = 0 ; i < COUNT ; i++ ) {
         ...
         function_to_be_timed();
         ...
       }
       get_current_time(&stop);
       experiment_time = stop - start;

       get_current_time(&start);
       for (i = 0 ; i < COUNT ; i++ ) {
         ...
         /* function_to_be_timed(); */
         ...
       }
       get_current_time(&stop);
       control_time = stop - start;
       total_function_time = experiment_time - control_time;
       function_time = total_function_time / COUNT;
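
A minimal runnable sketch of the experiment/control method above, in C.
It assumes gettimeofday() as the clock behind get_current_time(); the
names now_usec, time_loop, per_call_usec and sample_work are made up for
illustration, and sample_work merely stands in for function_to_be_timed.

```c
#include <stddef.h>
#include <sys/time.h>

volatile long work_sink;

/* Hypothetical stand-in for function_to_be_timed(). */
void sample_work(void)
{
    long j;
    for (j = 0; j < 1000; j++)
        work_sink += j;
}

/* Assumed clock source: wall-clock time in microseconds. */
double now_usec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec * 1e6 + (double)tv.tv_usec;
}

/* Time `count` passes of the loop.  With fn == NULL the body does no
   call, which gives the control measurement. */
double time_loop(void (*fn)(void), long count)
{
    volatile long i;   /* volatile so the empty loop is not optimized away */
    double start = now_usec();
    for (i = 0; i < count; i++) {
        if (fn != NULL)
            fn();
    }
    return now_usec() - start;
}

/* Per-call time in microseconds: (experiment - control) / count. */
double per_call_usec(void (*fn)(void), long count)
{
    double experiment_time = time_loop(fn, count);
    double control_time    = time_loop(NULL, count);
    return (experiment_time - control_time) / (double)count;
}
```

Calling per_call_usec(sample_work, COUNT) gives the estimate.  The NULL
test runs in both loops, so its cost cancels; the cost of the call
itself is charged to the function.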

I've gotten the best results when I chose a COUNT value large enough that
total_function_time came to at least 100 of whatever clock ticks the
machine provides.  Since the timings can be off by a tick or two in
total, this allows an accuracy of about 2% in the final function_time
values.
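
The arithmetic behind picking COUNT can be sketched as below.  This is
only an illustration: min_count and relative_error are hypothetical
helpers, and the tick length and the rough per-call estimate are numbers
you have to supply for your own machine.

```c
/* Smallest COUNT for which the function's share of the loop time spans
   at least min_ticks clock ticks.  tick_usec is the clock resolution and
   est_call_usec a rough guess at one call's duration (both assumed
   inputs, in microseconds). */
long min_count(double tick_usec, double est_call_usec, long min_ticks)
{
    double needed_usec = tick_usec * (double)min_ticks;
    long count = (long)(needed_usec / est_call_usec);
    if ((double)count * est_call_usec < needed_usec)
        count++;                        /* round up */
    return count < 1 ? 1 : count;
}

/* Relative error if the measured difference is off by err_ticks ticks
   out of total_ticks. */
double relative_error(long err_ticks, long total_ticks)
{
    return (double)err_ticks / (double)total_ticks;
}
```

For instance, with an assumed 10 ms tick and a function guessed to take
about 5 microseconds, min_count(10000.0, 5.0, 100) asks for 200000
iterations, and relative_error(2, 100) is the 2% figure.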

It's also a good idea to fill the ... parts of the code in with operations
which successfully flush the caches, to avoid falsely low readings caused
by hammering on the exact same code over and over.  You'll have to
experiment with this.  For example, try summing up the elements of a large
array, and keep making the array larger until the function_time stops
increasing.
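
One possible filler for the ... parts, sketched in C.  The 4 MB buffer
size is an assumption to be tuned by the experiment described above, and
flush_caches, flush_buf and flush_sink are illustrative names.  The loop
writes as well as sums so that every page is really touched.

```c
#include <stddef.h>

#define FLUSH_BYTES (4UL * 1024UL * 1024UL)  /* assumed size; tune it */

unsigned char flush_buf[FLUSH_BYTES];
volatile unsigned long flush_sink;  /* keeps the sum from being optimized out */

/* Walk a buffer meant to be larger than the cache, so that whatever the
   timed function left in the data (or unified) cache is likely evicted
   before the next call. */
void flush_caches(void)
{
    unsigned long sum = 0;
    size_t i;
    for (i = 0; i < FLUSH_BYTES; i++) {
        flush_buf[i]++;            /* write, so the lines are dirtied too */
        sum += flush_buf[i];
    }
    flush_sink = sum;
}
```

Put the same flush_caches() call in both the experiment loop and the
control loop so that its cost subtracts out.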

Another good control is to run the measurements three or four times, and
make sure you're getting repeatable results.

Also, you could measure the individual function times, and compute a mean
and standard deviation from the individual times.  This is the most
accurate approach, but also the most painful.

Beware smart optimizers!  Make sure you get a decent "dose-response"
curve; that is, make sure that a linear increase in COUNT causes a linear
increase in total_function_time.
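
A rough way to automate the dose-response check, given total times from
two runs at different COUNT values.  scales_linearly and its tolerance
parameter are assumptions for illustration, not a standard test.

```c
/* Returns 1 if total time scales roughly linearly with COUNT: the time
   per iteration at count_b must be within `tolerance` (e.g. 0.10 for
   10%) of the time per iteration at count_a.  Returns 0 otherwise. */
int scales_linearly(double time_a, long count_a,
                    double time_b, long count_b, double tolerance)
{
    double per_a = time_a / (double)count_a;
    double per_b = time_b / (double)count_b;
    double diff  = per_b > per_a ? per_b - per_a : per_a - per_b;

    return per_a > 0.0 && diff <= tolerance * per_a;
}
```

If doubling COUNT leaves the total nearly unchanged, the check fails,
which suggests the optimizer has hoisted or removed the call.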

This methodology has been accurate enough in the past for me to discover
such things as a single extra machine cycle out of a couple of hundred in
various functions' code.

   If all else fails I will have to resort to a sbus/vme - PC interface and
   read real time from a PC card :-( Can someone *please* save me from this
   fate!

Shouldn't be any need.  Plus, even if you do this, you'll still have to
repeat the measurements several times and divide out the result to assess
your accuracy.

Ollie Jones             Saber Software, Inc.       oj@saber.com
Saber-C Project Leader  185 Alewife Brook Parkway  uunet!saber.com!oj
+1(617)876-7636         Cambridge, MA 02138-9887   fax +1(617)868-9205