Path: utzoo!mnetor!uunet!mfci!root
From: root@mfci.UUCP (SuperUser)
Newsgroups: comp.arch
Subject: Re: debugging on a VLIW machine
Message-ID: <365@m3.mfci.UUCP>
Date: 25 Apr 88 19:22:00 GMT
References: <8860@sol.ARPA>
Reply-To: genly@multiflow.com (Chris Hind Genly)
Organization: Multiflow Computer Inc., Branford Ct. 06405
Lines: 108
Keywords: vliw, debugger
Summary: The user is a def.
Sender:


In article <8860@sol.ARPA> bukys@cs.rochester.edu (Liudvikas Bukys) writes:
>I'm curious... to run a debugger on a Multiflow Trace, and make any sense
>out of what you see, I suppose that you have to compile it with trace
>scheduling and other hairy optimizations off?  Is this true?
>
>(I suppose one could automatically intersperse debugger code between
>every line of source code, and have the trace scheduler intertwine it
>so that a source level debugger could still be made to work.  Is this done?)

Short answer: Trace scheduling doesn't make the problem any worse.

Long answer:

During the debugging process users reason about their program at the
source level.  Any optimization which breaks the correspondence
between source code and object code is going to make the code
difficult to reason about while debugging.  It is the job of the
optimizers to determine which features specified by the source are
essential for correctness and which features are incidental.
However, when compiling for a source level debugger, additional
constraints are placed on the compiler.  Features which were incidental
become essential.

One of the important properties users depend on while debugging is the
order of execution specified by the source.  If the compilation
process does not produce code which executes in the order specified by
the source, the user may find it confusing to debug.  Consider the
optimization of loop invariant motion.

	DO 10 I=1,10
		A = B(i) + F/Z
10 	CONTINUE

Say F/Z is loop invariant.  The object code will correspond more to this:

	T = F/Z
	DO 10 I=1,10
		A = B(i) + T
10 	CONTINUE

Now say the user sets a breakpoint at the entrance to the do loop.
If Z is zero, a fault will be generated before ever reaching the breakpoint.
Reasoning about the problem using the source code will lead to confusion.
"Why did the divide by zero fault occur on a line in the do loop before
the do loop was entered?"
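
As a rough illustration (a Python model rather than real compiler output;
the function names and the event log are made up for this sketch), the
two orderings can be compared by recording when the "breakpoint" at the
loop entry is reached and when the fault occurs:

```python
def run_source_order(b, f, z, events):
    """Execute in the order the source specifies."""
    events.append("breakpoint: loop entry")   # the user's breakpoint
    a = None
    for i in range(len(b)):
        a = b[i] + f / z                      # the divide is inside the loop
    return a


def run_hoisted(b, f, z, events):
    """Execute in the order the optimized object code would."""
    t = f / z                                 # hoisted invariant: runs before the loop
    events.append("breakpoint: loop entry")
    a = None
    for i in range(len(b)):
        a = b[i] + t
    return a


def trace(run, z):
    """Run one version with divisor z and return the events observed."""
    events = []
    try:
        run([1.0] * 10, 2.0, z, events)
    except ZeroDivisionError:
        events.append("fault: divide by zero")
    return events
```

With a nonzero Z both versions compute the same result.  With Z equal to
zero, the source-order version reaches the breakpoint and then faults,
while the hoisted version faults without ever reaching the breakpoint.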

There are other properties besides order of execution which the user depends
on.  Consider copy propagation and dead code removal in the following:

	A = 1
	CALL S(A)

After these optimizations the following would reflect the object code.

	CALL S(1)

By examining the source code, the user would expect to be able to set a
breakpoint at the call and change the value of A before the call is made.
But A is gone, so the user is confused again.
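
The same effect can be mimicked in a small Python model.  Everything
here is hypothetical scaffolding: `env` stands in for the variables the
debugger can see, and `at_breakpoint` stands in for whatever the user
does while stopped at the call.

```python
def callee(x, log):
    """Stand-in for subroutine S: record the argument it received."""
    log.append(x)


def run_unoptimized(at_breakpoint, log):
    env = {"A": 1}             # A = 1
    at_breakpoint(env)         # user is stopped at the call, may change A
    callee(env["A"], log)      # CALL S(A)


def run_optimized(at_breakpoint, log):
    env = {}                   # A was propagated, then removed as dead
    at_breakpoint(env)         # user is stopped at the call
    callee(1, log)             # CALL S(1): the constant is baked in


def set_a_to_5(env):
    """What the user tries at the breakpoint: assign 5 to A."""
    env["A"] = 5
```

In the unoptimized version the callee sees the user's new value.  In
the optimized version the assignment has no effect on the call, which
is exactly the surprise described above.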

There's an interesting way to think about these optimizations.  The
optimizers must determine if certain criteria are met before the
optimization can be performed.  Copy propagation can take place only
if there is one reaching def at the call site.  Besides the reaching
def from "A=1" there is a reaching def from the user with the
debugger.  The same applies for loop invariant motion.  F/Z can be
moved out of the loop only if it is invariant.  If the program is
being compiled for debugging, then the optimizer can no longer assume
F and Z are invariant, because the user may change them with the debugger.
In a sense the user with the debugger is supplying
reaching defs for all user variables to everywhere in the program.
[The user can be thought of as a def. :-)]
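
A minimal sketch of the "user as a def" idea, assuming straight-line
code only and a made-up tuple representation for statements (this is
not how any particular compiler represents things).  Compiling for the
debugger is modeled by adding one extra reaching def, owned by the
user, at the point of use:

```python
def reaching_defs(stmts, use_index, var, debug=False):
    """Return the defs of var that reach stmts[use_index].

    stmts is a list of (kind, target, value) tuples in straight-line
    order; a later def of var kills any earlier one.
    """
    reaching = set()
    for idx in range(use_index):
        kind, target, value = stmts[idx]
        if kind == "def" and target == var:
            reaching = {("stmt", idx)}
    if debug:
        # The user may assign var from the debugger just before the use.
        reaching.add(("user", use_index))
    return reaching


def can_propagate(stmts, use_index, var, debug=False):
    """Copy propagation criterion: exactly one def reaches the use."""
    return len(reaching_defs(stmts, use_index, var, debug)) == 1


program = [
    ("def", "A", 1),       # A = 1
    ("call", "S", "A"),    # CALL S(A)
]
```

Here can_propagate(program, 1, "A") holds for a normal compile, but
fails with debug=True because the user's def also reaches the call.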

So the classical optimizations are shut off when compiling for the
debugger because the debugger violates the criteria of these optimizations.
If the optimizations were performed anyway, the code would be difficult
to debug.

Notice that so far none of this has to do with trace scheduling.
Trace scheduling must determine which sequencing information presented
in the source program is incidental, and which is essential.  By
eliminating incidental sequencing information, the trace scheduler is
able to maximize parallelism.  But trace scheduling has exactly the
same problem.  Once the decision has been made to compile for the 
debugger, a great deal of sequencing information which was incidental
is now essential.

Users accept as standard procedure that they must decide to compile
for the debugger, and that doing so costs performance.

Of course all of this has to do with source level debugging.  This is
what happens when using a debugger such as dbx under Unix.  If one
is willing to accept that the compiler will make transformations on
the source which break the source-to-object correspondence, then
an object level debugger such as adb can be used.

The OS group at Multiflow uses adb on optimized code with little
trouble.  The optimizers and the trace scheduler scramble the code
thoroughly.  Reasoning about an arbitrary point in a program using
the source code is difficult.  However, at the entry to each function
things are a little easier.  Often the state of global variables
and function arguments is enough of a clue to determine what's going on
in the kernel.  In addition, those who debug the kernel understand
the machine language and are sometimes willing to wade into a function.

----------------------------------------------------------------------
Chris Genly, genly@multiflow.com