Xref: utzoo comp.software-eng:4425 comp.lang.c:33674
Path: utzoo!utgpu!cunews!mitel!melair!dataco!amodeo
From: amodeo@dataco.UUCP (Roy Amodeo)
Newsgroups: comp.software-eng,comp.lang.c
Subject: Re: error handling techniques?
Message-ID: <286@dcsun21.dataco.UUCP>
Date: 11 Nov 90 23:15:05 GMT
References: <1990Nov2.205831.23696@elroy.jpl.nasa.gov> <1990Nov3.153643.26368@clear.com>
Reply-To: nrcaer!amodeo@dataco.UUCP (Roy Amodeo,DC )
Organization: Canadian Marconi Company (Datacomm), Ottawa, Ontario
Lines: 107

In article <1990Nov3.153643.26368@clear.com> rmartin@clear.com (Bob Martin) writes:
>In article <1990Nov2.205831.23696@elroy.jpl.nasa.gov> alan@cogswell.Jpl.Nasa.Gov (Alan S. Mazer) writes:
>>I'm interested in what approaches people use for error handling, particularly
>>in general purpose function libraries and large software systems. If someone
>
>Alan:
>
>In a large software system the number of places where the code can
>detect errors can range into the tens of thousands. ...
>
>What I have done in the past to cope with this is to create an
>ErrorLog function which will write single line error messages into
>an error log file. ...
> ... Errors of similar types should _not_ use the
>same ! ...
>
>Every hour I close the current error log file and open a new one.
>At the end of the day I compile then into a summary and eyeball
>them to see if anything horrible went wrong. Software can be written
>to automatically scan these logs to see if there are critical errors.
>
> Hope this helps.
> I welcome discussion.

You got it.

I have some general comments about your scheme. The numbers would have to
either be stored in one central place or you would need an allocation
scheme that allocates blocks of numbers to various subsystems. Either way,
this seems like an awful lot of work to set up.

The manual nature of evaluating whether any serious errors have occurred
bothers me ( unless you're the only one that runs your software ).
It would require a rather intimate knowledge of the entire system. It also
bothers me that the errors are "hidden" in a logfile ( again assuming other
people run your software ).

Out of curiosity, how big do your logfiles get?

>
>----------------------------------------------------------------------
>R. Martin
>rmartin@clear.com
>uunet!clrcom!rmartin
>----------------------------------------------------------------------

And, in answer to Alan's original posting:

I really like exceptions. I don't use them. Exceptions in C require writing
an exception-handling mechanism, which I have never had the time to write
for my own "small" programs. There are other systems I use which have used
different error handling mechanisms from Day One and are "too big" to
change now.

All the code we write returns 0 on error. ( By never using '0' as an index,
I can usually get away with this. ) Usually failures trickle up to the
function level where enough is known that they can be handled.

The macro we use to "fail out" of a routine is called "FAILIF" and takes a
condition, an error number and an error parm as parameters. If the
condition is true, the error number is assigned to the global variable
errno, the error parm is assigned to the global variable errparm, and the
routine returns failure. In addition, FAILIF's behaviour can be modified to
do a little cleanup before returning, which solves one of the problems of
multiple returns, although it is not that elegant.

At higher levels, we will check for errors that we do not wish to handle
( like failures from malloc() ) by using fatal assertions. A fatal assertion
asserts that a condition is true (nonzero); otherwise it prints the string
argument, hex dumps any areas of memory that the user wishes to dump,
prints the errno, the name of the errno, the errparm, the line number, the
file name, and a function trace. ( This varies from the standard UNIX
assert() mechanism. ) It then terminates the program.
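To make the FAILIF and fatal-assertion description above concrete, here is a minimal sketch in C. It is not the original macros, which are not shown in this post. The type of errparm, the helper fassert_fail(), and the example routines find_slot() and must_find_slot() are all assumptions made for illustration, and the hex dump, errno name and function trace are left out.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical global; the post names errparm but not its type. */
long errparm = 0;

/* Fail out of the current routine: record the error, then return 0. */
#define FAILIF(cond, err, parm)          \
    do {                                 \
        if (cond) {                      \
            errno = (err);               \
            errparm = (long)(parm);      \
            return 0;                    \
        }                                \
    } while (0)

/* Hypothetical helper: report what is known, then terminate. */
void fassert_fail(const char *msg, const char *file, int line)
{
    int saved = errno;  /* fprintf may change errno, so save it first */

    fprintf(stderr, "%s:%d: fatal assertion failed: %s\n", file, line, msg);
    fprintf(stderr, "  errno=%d (%s), errparm=%ld\n",
            saved, strerror(saved), errparm);
    exit(EXIT_FAILURE);
}

/* Fatal assertion: the condition must be true (nonzero). */
#define FASSERT(cond, msg)                               \
    do {                                                 \
        if (!(cond))                                     \
            fassert_fail((msg), __FILE__, __LINE__);     \
    } while (0)

/* Low level: returns a 1-based slot number, or 0 on error. */
int find_slot(const int *table, int n, int key)
{
    int i;

    FAILIF(table == NULL, EINVAL, 0);
    FAILIF(n <= 0, EINVAL, n);
    for (i = 0; i < n; i++)
        if (table[i] == key)
            return i + 1;   /* never 0, so 0 can mean failure */
    FAILIF(1, ENOENT, key);
    return 0;               /* not reached */
}

/* Higher level: a missing key is not something we want to handle. */
int must_find_slot(const int *table, int n, int key)
{
    int slot = find_slot(table, n, key);

    FASSERT(slot != 0, "key missing from table");
    return slot;
}
```

The do { ... } while (0) wrapper lets each macro be used as a single statement, including in an unbraced if/else. Because FAILIF contains a return, it only works inside functions whose return type can hold 0.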
A non-fatal assert is also available for conditions that must be reported
but need not be acted upon.

Code using FAILIF and fatal assertions reads quite easily and is easy to
write. You generally check a condition once, FAILIF or FASSERT it, and
continue, secure that you are dealing with only valid values from here on
in. Assertions should actually be coded in the interface to the routine
because they can be valuable documentation, but we're not that
sophisticated yet.

To reduce code overhead, there are a number of functions whose failure is
almost never handled (fclose, malloc, write, ... ). These functions are
generally wrapped in envelopes that assert the success of the call. The
user can then use the secure call if he wishes to program safely, or the
lower level interface if he can handle the error himself or if he doesn't
care. ( Apathy is the only good reason for ignoring return codes. )

One of the problems with the trickle-up method of subroutine failure is
that you often do not wish to decide how fatal the error is at the lower
level, so the error trickles up to a much higher level where the severity
is understood, but the exact condition which caused the error is lost.
There are also cases where no one level contains all the info needed for a
meaningful error message.

One solution to this is to use a stack of errnos and errparms instead of
single global variables. It also helps to have a user-definable error
string that is saved in this stack. As the error gets passed up the call
chain, more information is added. If the main program chooses to abort, the
entire error stack can be dumped, giving a complete description of the
error. Although this generates really nicely detailed error messages with
very little coding trouble, I have not used it on any programs that have
enough levels of function calling to make it really worthwhile.

Anyway, those are my experiences. And my code is usually a great test suite
for error checking mechanisms!

rba iv
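The stack of errnos, errparms and error strings described above might look roughly like the following C sketch. It is only one possible shape under stated assumptions, not the code the post refers to. Every name here (err_push, err_at, err_count, err_clear, err_dump, the fixed sizes, and the example routines read_record and load_customer) is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define ERRSTACK_MAX 16   /* assumed limits, chosen arbitrarily */
#define ERRMSG_MAX   80

struct err_entry {
    int  errnum;             /* what would have gone into errno */
    long errparm;            /* the accompanying parameter */
    char msg[ERRMSG_MAX];    /* user-definable description */
};

static struct err_entry err_stack[ERRSTACK_MAX];
static int err_depth = 0;

/* Record one more level of detail; silently dropped if the stack is full. */
void err_push(int errnum, long errparm, const char *msg)
{
    struct err_entry *e;

    if (err_depth >= ERRSTACK_MAX)
        return;
    e = &err_stack[err_depth++];
    e->errnum = errnum;
    e->errparm = errparm;
    strncpy(e->msg, msg, ERRMSG_MAX - 1);
    e->msg[ERRMSG_MAX - 1] = '\0';
}

void err_clear(void) { err_depth = 0; }

int err_count(void) { return err_depth; }

/* Entry i, where 0 is the innermost (first recorded) failure. */
const struct err_entry *err_at(int i)
{
    return (i >= 0 && i < err_depth) ? &err_stack[i] : NULL;
}

/* Dump the whole stack, innermost failure first. */
void err_dump(FILE *out)
{
    int i;

    for (i = 0; i < err_depth; i++)
        fprintf(out, "  [%d] errno=%d errparm=%ld: %s\n", i,
                err_stack[i].errnum, err_stack[i].errparm,
                err_stack[i].msg);
}

/* Low level: knows the exact condition, not how serious it is. */
int read_record(int id)
{
    if (id < 0) {
        err_push(EINVAL, id, "record id is negative");
        return 0;
    }
    return 1;
}

/* Higher level: adds its own context as the failure trickles up. */
int load_customer(int id)
{
    if (!read_record(id)) {
        err_push(EIO, id, "could not load customer");
        return 0;
    }
    return 1;
}
```

A main program that decides to abort after load_customer() returns 0 could call err_dump(stderr) to print both the low-level cause and the higher-level context, then err_clear() if it carries on instead.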