Xref: utzoo comp.lang.misc:1474 comp.lang.pascal:826
Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!oliveb!intelca!mipos3!omepd!bobdi
From: bobdi@omepd (Bob Dietrich)
Newsgroups: comp.lang.misc,comp.lang.pascal
Subject: Re: Threatening Pascal Loops
Message-ID: <3401@omepd>
Date: 21 Apr 88 00:23:36 GMT
References: <2827@enea.se> <1557@pasteur.Berkeley.Edu> <2773@mmintl.UUCP> <294@tmsoft.UUCP> <11047@shemp.CS.UCLA.EDU> <3364@omepd> <11369@shemp.CS.UCLA.EDU>
Reply-To: bobdi@omepd.UUCP (Bob Dietrich)
Organization: Intel Corp., Hillsboro, Oregon
Lines: 134

In article <11369@shemp.CS.UCLA.EDU> gast@lanai.UUCP (David Gast) writes:
> ...
>Dietrich goes on to explain what threatened means. Are there any
>compilers that detect every possible instance of a threatened
>variable?

Yes, there are several. The rules were defined so that a violation is
easily detectable at compile time. In exchange, the rules are stricter
than would be required if violations were checked at runtime (i.e., not
all threats to control variables are actually harmful).

>
>Presumably there must be some other verbiage to the extent
>that an index variable cannot be accessed after the loop ends.

The control variable is undefined at the end of a for-statement, as long
as the loop has not been left by a goto-statement. More below.

>The following program is also illegal, but again, I suspect that
>most compilers do not detect the error. The standard Berkeley 4.3
>compiler does not.

There's a lot the Berkeley compiler doesn't do.

>
> [code deleted]
>Essentially, slightly overstated, one will have to guarantee that
>a loop control variable is never used outside the scope of the for
>loop or do extensive run-time checking.
>
>Of course, you can put extra checks into the compiler to try and detect
>these errors. These checks take time and make the compiler bigger.

Yes, I suppose you could do data-flow analysis of programs.
I know of one Pascal compiler that did: it was indeed very expensive and
helped bring about the concept of threatening. If you introduce a
separate compilation facility, things get much worse.

Unfortunately, you are only really addressing part of the more general
problem of use of undefined variables. Pascal has one of the few
language specifications around that actually discusses the concept of
undefined variables. It specifies when variables become defined (with a
value) and when they become undefined. In general, it is an error to use
ANY undefined variable in an expression. The example given just happens
to use one of the ways a variable may become undefined. If you delete
the for-statement altogether, leaving the variable i uninitialized, you
may see what I'm getting at.

In practice, however, what is definable in an axiomatic manner in the
specification is usually not implemented. To do proper detection of
undefined variables, you typically need a tagged architecture, or must
put a little more effort into your Pascal processor (compiler+runtime or
interpreter). Most architectures in popular use do not have a mechanism
to tag a variable as being undefined (there is no such thing as an
"undefined value"!). Lacking such aids, a processor must (optimally) do
some flow analysis to try and catch use of undefined variables at
translation time, and generate checks for those cases that are not
determinable until runtime. This involves at least a bit per variable,
which must be passed around wherever the variable is referenced. Not too
bad for normal variables, but when you consider that an array is
undefined unless all its components are defined, things can escalate
quickly. What you end up with is translation-time analysis, runtime
checks, and some potentially large additional data structures. Most
people apparently don't feel the results are worth the expense, since it
isn't commonly implemented.
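
To make the bookkeeping concrete, here is a minimal sketch of the
runtime half of such a scheme: one "defined" bit carried alongside every
array component and checked on each use. It is written in Python rather
than Pascal purely for illustration, and the names TrackedArray,
UndefinedVariableError, undefine and is_defined are hypothetical, not
taken from any actual Pascal processor.

```python
class UndefinedVariableError(Exception):
    """Raised when an undefined component is used in an expression."""


class TrackedArray:
    """Fixed-size array carrying one 'defined' bit per component.

    Illustrative assumption: integer indexes only, no slicing.
    """

    def __init__(self, size):
        self._values = [None] * size    # storage; meaningless until defined
        self._defined = [False] * size  # the extra bit per component

    def __setitem__(self, index, value):
        self._values[index] = value
        self._defined[index] = True     # assignment defines the component

    def __getitem__(self, index):
        if not self._defined[index]:    # runtime check on every use
            raise UndefinedVariableError(f"component {index} is undefined")
        return self._values[index]

    def undefine(self, index):
        """Mark a component undefined again (compare a for-statement's
        control variable after the loop terminates normally)."""
        self._defined[index] = False

    def is_defined(self):
        """The array as a whole is defined only if every component is."""
        return all(self._defined)
```

With a = TrackedArray(3), reading a[0] before assigning it raises
UndefinedVariableError, and a.is_defined() stays false until all three
components have been assigned. Even this toy version pays for a second
list as long as the data itself and a test on every read, which is the
kind of cost described above.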
Given how many times I've seen myself or others chase down bugs caused
by uninitialized or undefined variables, it's a shame that such checking
is so rare.

>
>As long as you end up (de facto) not allowing the loop control variable
>to be used outside the scope of the for loop, why not do the sensible
>thing in the first place? Why not decide *in the language definition*
>that the loop control variable is local to the loop? That is the
>decision Algol 68 made and it works. It is impossible to assign to a
>loop control variable in Algol 68. And no concept like "threatening"
>is needed so it is easier to learn.
>
>One could have the following syntax:
>
>    for FOR-CONT-VAR : S-TYPE := LB to UB do STATEMENT

This alone does not cure the problem. What if STATEMENT is an invocation
of the read procedure, an assignment to the control variable, or a
procedure call that passes the control variable as a variable parameter?
You still need rules similar to the threat concept. Furthermore, you
have now introduced a brand new place where variables can be declared.
Not necessarily evil, but another concept to specify and explain.

>
>The one argument against this change is that such a change would make
>previously valid Pascal programs illegal. But the use of threatening also
>makes previously valid Pascal programs illegal. That is, a loop
>control variable can be threatened without actually being assigned to.
>Such a program would not have been illegal under the old standard, but
>it is, as I understand it, under the new standard.

Just to be clear, there is only one official standard for Pascal right
now, embodied in the ANSI/IEEE and ISO standards. Extended Pascal is not
yet a standard, as it is still under development, but it is nearing
completion of this go-around. The concept of threatening, however, is in
the current standard. The only changes in Extended Pascal (if there are
any, I can't remember) are for new language features.

>
>The old standard Pascal had many type insecurities.
>Perhaps the new standard eliminates all of these; probably it doesn't.
>As I no longer have to use Pascal, I do not care to investigate myself.
>If the new Pascal, however, does not have any type insecurities, then
>it is a far different language. The defining document and the compilers
>and run time systems are also undoubtedly much bigger as well.

If by the "old standard" you mean the Pascal User Manual and Report by
Jensen and Wirth, I agree that this was a de facto standard and had many
problems. Hence the current standard, which specifies name type
compatibility (J&W was foggy on this point). Other than type
compatibility, I know of no other "type insecurities", which is an
entirely different subject than the one we have been discussing. BTW,
the Third Edition of J&W was extensively revised to incorporate
decisions made in the standard.

As far as "new Pascal" goes, Extended Pascal does not replace the
current standard. They will co-exist as long as there is a desire to
keep both alive. If you want a simpler language, use the current
standard. If you want the features people have frequently been adding to
the language, like modularity, string handling, etc., use Extended
Pascal. Either way, you pay for what you get (or don't get).
Furthermore, Extended Pascal is upward compatible with the current
standard, unless you happen to have a variable called "module" or one of
the few other new reserved words.

Perhaps it is a bit late to say this, but I think I agree with your aims
of security and simplicity. I've just been trying to reconcile reality
with what you said, and point out some of the problems of achieving
those aims. I like my programs to work; that's why I avoid using C
whenever possible.

Bob Dietrich
Intel Corporation, Hillsboro, Oregon
(503) 696-4400 or 2092 (messages x4188,2111)
usenet: tektronix!ogcvax!omepd!bobdi
    or  tektronix!psu-cs!omepd!bobdi
    or  ihnp4!verdix!omepd!bobdi