Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!lll-winken!uunet!ncrlnk!ncrcae!hubcap!John
From: ames!mailrus!BBN.COM!jr@uunet.UU.NET (John Robinson)
Newsgroups: comp.parallel
Subject: Re: Opinions on Debugging Parallel Programs
Message-ID: <4907@hubcap.clemson.edu>
Date: 25 Mar 89 23:46:54 GMT
Sender: fpst@hubcap.clemson.edu
Lines: 38
Approved: parallel@hubcap.clemson.edu

In article <4821@hubcap.UUCP>, hammonds@riacs (Steve Hammond) writes:
>I think a useful tool would be something that captured the order
>of "events" to make a MIMD program have a repeatable order of
>execution. When I am debugging I want a deterministic sequence
>of events. For example, I want processes to finish tasks in the same order.
>I believe something like this was being worked on at U. Rochester.
>I think that one of the people involved was Tom LeBlanc if you
>want to check into it. It was being developed on their 128 node butterfly.
>Anyone know the status of this?

BBN now has an event logger and display system for the Butterfly,
called gist. You capture events by logging them locally in each
processor. Then you collect the logs and post-process them with a nice
interactive graphical front-end that lets you display selected event
types, selected processors, various time grains, etc. I don't know
whether it builds on the Rochester work or is home-grown.

>>If you have used any existing parallel debugger systems, either
>>commercial or experimental, could you name them and give me some
>>feedback on their usefulness?

Though I haven't used the Butterfly version, we built a simulator for
the Monarch that used a similar mechanism, enhanced to catch things
like switch and memory contention (and optionally make them into
events). Very nice for elucidating algorithms and problems with their
memory referencing behavior.

Our view (well, one view here) on debugging parallel programs is that
it takes three steps:

1. Debug it on one processor.
2. Debug it on two processors.
3. Run it on N processors.
   Very few bugs show up here, though you may run into hot spots and
   strange N-processor race effects.
--
/jr
jr@bbn.com or bbn!jr
C'mon big money!
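
The log-locally, collect, then post-process scheme described earlier in this post can be sketched roughly as below. This is only an illustration of the general idea, not gist's actual interface, which the post does not describe. Every name here (Event, LocalLog, merge_logs, select, bucket) is hypothetical, and the sketch assumes all processors stamp events from a common clock and that each local log is appended in time order.

```python
import heapq
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Event:
    # Field order matters: events sort by time, then processor, then kind.
    time: int   # timestamp; assumed to come from a clock shared by all processors
    proc: int   # processor that logged the event
    kind: str   # event type, e.g. "task_start" (hypothetical names)


class LocalLog:
    """Per-processor log: appending needs no coordination with other processors."""

    def __init__(self, proc):
        self.proc = proc
        self.events = []

    def log(self, time, kind):
        # Assumes callers log with non-decreasing timestamps.
        self.events.append(Event(time, self.proc, kind))


def merge_logs(logs):
    """Collect the local logs into one globally time-ordered trace.

    Each local log is already sorted, so a k-way merge is enough.
    """
    return list(heapq.merge(*(log.events for log in logs)))


def select(events, kinds=None, procs=None):
    """Keep only the chosen event types and/or processors (None means all)."""
    return [
        e for e in events
        if (kinds is None or e.kind in kinds)
        and (procs is None or e.proc in procs)
    ]


def bucket(events, grain):
    """Count events per time bucket of width `grain` (a coarser time grain)."""
    return dict(Counter(e.time // grain for e in events))


# Small usage example with two processors.
logs = [LocalLog(0), LocalLog(1)]
logs[0].log(10, "task_start")
logs[0].log(40, "task_done")
logs[1].log(5, "task_start")
logs[1].log(25, "task_done")

trace = merge_logs(logs)
finished = select(trace, kinds={"task_done"})
per_interval = bucket(trace, grain=20)
```

Keeping the logging local is what makes it cheap at run time. The ordering and filtering work is deferred to the post-processing step, where selecting by event type, processor, or time grain is just a pass over the merged trace.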