Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!MCC.COM!rfg
From: rfg@MCC.COM (Ron Guilmette)
Newsgroups: gnu.g++
Subject: Compiler reliability & testing
Message-ID: <8906201624.AA12323@pink.aca.mcc.com>
Date: 20 Jun 89 16:24:22 GMT
Sender: daemon@tut.cis.ohio-state.edu
Distribution: gnu
Organization: GNUs Not Usenet
Lines: 92

Recently, Eugene Brooks wrote:

> Several posters have indicated, and later strongly supported,
> statements about reliability problems with GCC.  As you might
> guess at this point, the reliability problems are real.
>
> The reason why new releases are often flakey is probably
> the lack of sufficient regression testing.

Congratulations Mr. Brooks!  Why don't you tell us that it gets cooler
at night than it is during the daytime, or that Arabs don't like
Israelis.

Sorry.  I'm not attacking you or your statement, but I would be
surprised if anybody *DIDN'T* understand that this is exactly and
precisely THE MAIN THING that is holding back the reliability of both
GCC and G++ (not to mention GDB, for which there could also be
automated tests).

> A regression
> test package should be built and distributed as part of
> GCC.

Amen.  Portable compilers have a higher than normal need for
complementary validation suites.  Such a suite (for GCC) could
significantly reduce the uncertainty left after an initial port to a
new machine is done.

> Before each publicly announced release FSF should
> ship the new release to several friendly sites which will
> run the regression tests on hardware FSF does not
> have direct access to.

This is happening now, except that rms does *not* give people any
special guidance on *what* to use as a test.  Usually, the pre-testers
just use their favorite packages.  If there are enough pre-testers,
this can give good coverage, but it is still too haphazard for my
tastes.

> The major problem with regression
> test packages is the sheer volume of code which must be
> written, with the writer thinking that he is not doing something
> useful (like working directly on the compiler).  The solution
> to this problem is to request that bug reports include if
> possible a test program which is in the standard regression
> test format.

I have been slowly building up test suites for G++ and GCC (i.e. C++
and C).  I now have an automated means of executing all of the test
cases in the suites, analyzing the outcomes of each test, and
reporting the results.  The "driver" routines are Bourne shell
scripts.  They were originally written as C-shell scripts, but I
converted them so that AT&T could run my C++ tests also.

I have also encouraged at least one person to convert his base of
tests to my format.  I will gladly publish my format here (and accept
new contributions) if that will help.

This approach, of building up tests based on bugs found, is certainly
better than nothing, but a far more robust suite could be built
through a concerted effort based on an analysis of the ANSI standard
(as was done in the case of Ada).  This could take up an enormous
amount of manpower (and would probably have a non-trivial cost in $$),
but if the resulting suite were to be placed into the public domain
then everyone would benefit greatly.

> A single program which runs and produces the message PASSED or FAILED
> on standard output.  The regression test driver can trigger on these
> keywords and inform the tester.  Liberal use of tests for failure
> with suitable printing of line and file numbers of where the compilation
> failure occurred.  A pointer to the source of the test program in case help
> is required in bug shooting the compiler.

My test suite drivers now do all of the above, except that I have
decided that it is simpler and easier to have the executable tests
simply return a zero or non-zero exit code to indicate pass/fail
results.

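To illustrate the exit-code convention, here is a minimal sketch of
what such a driver might look like.  It is *not* the actual driver
code: the function name run_suite, the one-directory layout, and the
report format are all assumptions made up for this example, and it is
written for a POSIX-style sh.

```shell
#!/bin/sh
# Hypothetical sketch only: run_suite and its report format are
# illustrative assumptions, not the real test drivers.
# Convention assumed: each test is an executable that exits with
# zero on pass and non-zero on fail.
run_suite() {
    dir=$1
    passed=0
    failed=0
    for t in "$dir"/*; do
        # Skip anything that is not an executable regular file.
        [ -f "$t" ] && [ -x "$t" ] || continue
        # Run the test; only its exit status is examined.
        if "$t" >/dev/null 2>&1; then
            passed=$((passed + 1))
            echo "PASSED: $t"
        else
            failed=$((failed + 1))
            echo "FAILED: $t"
        fi
    done
    echo "$passed passed, $failed failed"
    # The driver follows the same convention as the tests:
    # zero only if no test failed.
    [ "$failed" -eq 0 ]
}
```

With the function defined, something like `run_suite ./tests` (the
directory name is again hypothetical) would print one line per test
plus a summary.  Keying on the exit status rather than on PASSED or
FAILED text means a test that crashes before printing anything is
still counted as a failure.
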
> With all the bug reports which are coming in, it should be no time
> at all that we have a large and useful regression test library for
> GCC.  At least we won't have repeat visits on previous bugs.

I have *not* been collecting GCC bug reports very intensely, but I
have got a nice set of tests for G++.

One thing worth noting is that I always try to get the tests
themselves down to fewer than 100 lines.  Thus, when I see a G++ bug
report (usually from a novice) come across on bug-g++, and I see that
the mail message is greater than about 10K long, I don't even bother
with it, because it is too much work to hack it down to size and to
figure out what it is *supposed* to do when it is working correctly.

// Ron Guilmette  -  MCC  -  Experimental Systems Kit Project
// 3500 West Balcones Center Drive,  Austin, TX  78759  -  (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg