Path: utzoo!mnetor!uunet!mfci!root
From: root@mfci.UUCP (SuperUser)
Newsgroups: comp.arch
Subject: Re: debugging on a VLIW machine
Message-ID: <365@m3.mfci.UUCP>
Date: 25 Apr 88 19:22:00 GMT
References: <8860@sol.ARPA>
Reply-To: genly@multiflow.com (Chris Hind Genly)
Organization: Multiflow Computer Inc., Branford Ct. 06405
Lines: 108
Keywords: vliw, debugger
Summary: The user is a def.
Sender:

In article <8860@sol.ARPA> bukys@cs.rochester.edu (Liudvikas Bukys) writes:
>I'm curious... to run a debugger on a Multiflow Trace, and make any sense
>out of what you see, I suppose that you have to compile it with trace
>scheduling and other hairy optimizations off? Is this true?
>
>(I suppose one could automatically intersperse debugger code between
>every line of source code, and have the trace scheduler intertwine it
>so that a source level debugger could still be made to work. Is this done?)

Short answer: Trace scheduling doesn't make the problem any worse.

Long answer:

During the debugging process users reason about their program at the
source level. Any optimization which breaks the correspondence between
source code and object code is going to make the code difficult to
reason about while debugging.

It is the job of the optimizers to determine which features specified
by the source are essential for correctness and which features are
incidental. However, when compiling for a source-level debugger,
additional constraints are placed on the compiler. Features which were
incidental become essential.

One of the important properties users depend on while debugging is the
order of execution specified by the source. If the compilation process
does not produce code which executes in the order specified by the
source, the user may find it confusing to debug.

Consider the optimization of loop invariant motion.

      DO 10 I=1,10
          A = B(I) + F/Z
   10 CONTINUE

Say F/Z is loop invariant.
The object code will correspond more to this:

      T = F/Z
      DO 10 I=1,10
          A = B(I) + T
   10 CONTINUE

Now say the user sets a breakpoint at the entrance to the do loop. If
Z is zero, a fault will be generated before ever reaching the
breakpoint. Reasoning about the problem using the source code will
lead to confusion. "Why did the divide by zero fault occur on a line
in the do loop before the do loop was entered?"

There are other properties besides order of execution which the user
depends on. Consider copy propagation and dead code removal in the
following:

      A = 1
      CALL S(A)

After these optimizations the following would reflect the object code.

      CALL S(1)

By examining the source code, the user would expect to be able to set
a breakpoint at the call and change the value of A before the call is
made. But A is gone, so the user is confused again.

There's an interesting way to think about these optimizations. The
optimizers must determine if certain criteria are met before the
optimization can be performed. Copy propagation can take place only if
there is one reaching def at the call site. Besides the reaching def
from "A=1" there is a reaching def from the user with the debugger.

The same applies for loop invariant motion. F/Z can be moved out of
the loop only if it is invariant. If the program is being compiled for
debugging, then the optimizer can no longer assume F and Z are
invariant, because the user may change them with the debugger.

In a sense the user with the debugger is supplying reaching defs for
all user variables to everywhere in the program. [The user can be
thought of as a def. :-)]

So the classical optimizations are shut off when compiling for the
debugger because the debugger violates the criteria of these
optimizations. If the optimizations were performed anyway, the code
would be difficult to debug.

Notice that so far none of this has to do with trace scheduling.
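
The "user is a def" argument above can be sketched as a toy
reaching-definitions check. This is only an illustration in Python, not
anything taken from the Multiflow compiler: it handles straight-line
code only, and every name in it (DEBUGGER, reaching_defs,
can_propagate_copy, program) is hypothetical.

```python
# Toy reaching-definitions check for a straight-line program.
# All names here are hypothetical and exist only for this illustration.

DEBUGGER = "<debugger>"  # pseudo-definition site standing in for the user


def reaching_defs(statements, use_index, var, debuggable):
    """Return the definition sites of `var` that reach statement `use_index`.

    `statements` is a list of (label, defined_var_or_None) pairs. The
    program is straight-line, so only the last definition before the use
    reaches it.
    """
    defs = set()
    for label, defined in statements[:use_index]:
        if defined == var:
            defs = {label}  # a later def kills the earlier ones
    if debuggable:
        # The user may stop anywhere and assign to any variable, so the
        # debugger counts as one more definition reaching every use.
        defs.add(DEBUGGER)
    return defs


def can_propagate_copy(statements, use_index, var, debuggable):
    """Copy propagation requires exactly one reaching def at the use."""
    return len(reaching_defs(statements, use_index, var, debuggable)) == 1


# The example from the text: A = 1 followed by CALL S(A).
program = [
    ("A = 1", "A"),
    ("CALL S(A)", None),
]

optimized = can_propagate_copy(program, 1, "A", debuggable=False)
debug = can_propagate_copy(program, 1, "A", debuggable=True)
print(optimized, debug)
```

Run as written, this prints "True False": with the debugger out of the
picture, "A = 1" is the only def reaching the call and CALL S(1) is
legal; once the debugger is counted as a def, the single-reaching-def
criterion fails and the optimization has to be skipped.
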
Trace scheduling must determine which sequencing information presented
in the source program is incidental, and which is essential. By
eliminating incidental sequencing information, the trace scheduler is
able to maximize parallelism. But trace scheduling has exactly the
same problem. Once the decision has been made to compile for the
debugger, a great deal of sequencing information which was incidental
is now essential.

Users accept as standard procedure that they must decide to compile
for the debugger, and that they lose performance when they do.

Of course all of this has to do with source-level debugging. This is
what happens when using a debugger such as dbx under Unix. If one is
willing to accept that the compiler will make transformations on the
source which break the source-to-object correspondence, then an
object-level debugger such as adb can be used.

The OS group at Multiflow uses adb on optimized code with little
trouble. The optimizers and the trace scheduler scramble the code
thoroughly. Reasoning about an arbitrary point in a program using the
source code is difficult. However, at the entry to each function
things are a little easier. Often the state of global variables and
function arguments is enough of a clue to determine what's going on in
the kernel. In addition, those who debug the kernel understand the
machine language and are sometimes willing to wade into a function.

----------------------------------------------------------------------
Chris Genly, genly@multiflow.com