Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!noose.ecn.purdue.edu!mentor.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: End-of-buffer interrupt instruction
Message-ID: <2577@l.cc.purdue.edu>
Date: 18 Sep 90 18:14:07 GMT
References: <2516@l.cc.purdue.edu> <6838.26e7f109@vax1.tcd.ie> <2123@key.COM> <WAYNE.90Sep17125041@dsndata.uucp>
Organization: Purdue University Statistics Department
Lines: 61

In article <WAYNE.90Sep17125041@dsndata.uucp>, wayne@dsndata.uucp (Wayne Schlitt) writes:
> In article <2567@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:

			......................

| > The advantage of an interrupt procedure over a test each time is that
| > the interrupt is rare, and does not have the costs of the test when it
| > is not invoked.  It is likely to lengthen the instruction, however.  On
| > the other hand, an interrupt certainly has much higher costs than the
| > test when it does cut in.
 
 
> the more i think about it, the less i see any value of this type of
> instruction.  IFF your processor has a deep pipeline and/or it doesnt
> stall on loads from memory AND your buffer isnt in cache, then your
> compare is going to be totally hidden by the delay of loading from
> memory.  let's look at the code...
 
 
> you want something like this:
 
> top_of_loop:
>         add     R1,R0                   # do something with R0        
>         loadbuf R0,buffer,endofbuf      # fetch something from the buffer
>                                         # check to make sure it isnt past
>                                         # endofbuf and increment buffer to
>                                         # the next item
>         bra     top_of_loop             # do some more processing.
> 
> 
> instead, what you are getting is:
 
> top_of_loop:
>         add     R1,R0                   # do something with R0
>         load    R0,(buffer)             # fetch something from the buffer
>         cmp     buffer,endofbuf         # see if we are at the end of buf
>         ble     top_of_loop             # if not, go process it
>         inc     buffer                  # ye, old, delay slot of the branch
 
> since R0 isnt going to be ready to use for several instructions after
> the load, your code is simply going to stall on the add at the top of
> the loop.  the code you are getting is going to be doing the compare
> instead of stalling.  i dont see what the difference is.  your loop
> isnt going to run any faster...

Why should I not run other instructions while waiting for the memory access?
I am not reading a whole buffer; I am reading one item, and doing something
with it.  If I were using such a short loop, I would definitely check first
whether the number of items to be read is more or less than what is in the
buffer, and write my code accordingly.
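
As a rough sketch of that last point, the end-of-buffer test can be made
once per batch instead of once per item.  The names here (struct buf,
sum_batch) are made up for illustration and do not come from any real
library; summing stands in for whatever is actually done with each item.

```c
#include <stddef.h>

/* Hypothetical buffer descriptor, assumed for this sketch only. */
struct buf {
    const int *next;   /* next unread item */
    const int *end;    /* one past the last item now in the buffer */
};

/*
 * Add up to n items into *total and return how many were consumed.
 * The count is compared against what the buffer holds once, before
 * the loop, so the inner loop carries no end-of-buffer test.
 */
size_t sum_batch(struct buf *b, size_t n, long *total)
{
    size_t avail = (size_t)(b->end - b->next);
    size_t k = n < avail ? n : avail;   /* the single up-front test */

    for (size_t i = 0; i < k; i++)
        *total += b->next[i];

    b->next += k;
    return k;   /* k < n tells the caller to refill and call again */
}
```

A caller that gets back fewer items than it asked for would refill the
buffer and call again; that rare path plays the role of the interrupt.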

I am quite aware of the problems of programming when instructions can run
in parallel, and I write my code to take advantage of that parallelism,
sometimes even when the language gives me no way to tell the compiler to
exploit it.


-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)