Path: utzoo!attcan!uunet!lll-winken!ames!mailrus!uflorida!gatech!hubcap!utter
From: utter@tcgould.tn.cornell.edu (Paula Sue Utter)
Newsgroups: comp.parallel
Subject: Opinions on Debugging Parallel Programs
Message-ID: <4810@hubcap.UUCP>
Date: 16 Mar 89 15:40:43 GMT
Sender: fpst@hubcap.UUCP
Lines: 55
Approved: parallel@hubcap.clemson.edu

One of the hot topics of research today is how to debug parallel programs, both on shared memory multiprocessors and on distributed memory machines. Often it seems these debugger systems are designed more for ease of implementation than for maximum utility and ease of use.

What I'd like to do is get some opinions on just what sort of features a good debugger system for parallel programs should provide. The kind of information I'm looking for includes:

Is it harder to write and debug new parallel programs, or to parallelize "dusty deck" serial programs?

What are the most common bugs you have encountered during parallel program development and production runs (e.g., an unintentional change to a shared variable)?

What methods have you used in your attempts to debug parallel programs? Of these, which were most successful?

What types of tools do you think would be helpful in developing and debugging parallel programs? (For example, would it be helpful to observe sequential execution within each process executing in parallel?)

What important events or features should be displayed when representing parallel program execution? (Some suggestions might be synchronization mechanisms, interprocess communication patterns, and updates to shared variables.)

When working with parallel programs, people often employ graphical representations that reflect their mental model of the problem at hand.
Could you give me a verbal description of the way you envision such things as:

    Parallel execution
    Interprocess communication
    Synchronization schemes

Since many people now include performance evaluation and improvement as part of the debugging process when dealing with parallel programs, what type of information would be useful in this area?

If you have used any existing parallel debugger systems, either commercial or experimental, could you name them and give me some feedback on their usefulness?

I'd really appreciate any opinions you have on this matter. Please send responses to utter@tcgould.tn.cornell.edu. If I get enough responses, I'll compile the results and post them here.

Thanks in advance.

Sue Utter
Technology Integration Group
Cornell National Supercomputer Facility