Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!iuvax!purdue!mentor.cc.purdue.edu!pur-ee!hankd
From: hankd@pur-ee.UUCP (Hank Dietz)
Newsgroups: comp.lang.fortran
Subject: Re: Fxref & Flink -- deficiencies?
Message-ID: <12112@pur-ee.UUCP>
Date: 4 Jul 89 17:04:47 GMT
References: <1989Jul3.125106.27708@cs.dal.ca>
Reply-To: hankd@pur-ee.UUCP (Hank Dietz)
Organization: Purdue University Engineering Computer Network
Lines: 64

In article <1989Jul3.125106.27708@cs.dal.ca> silvert@cs.dal.ca (Bill Silvert) writes:
>Mike Fischbein has pointed out to me that the fxref and flink utilities
>which I have written and distributed do not handle suitably obfuscated,
>but perfectly legal, Fortran code.  For example, they cannot find
>the correct variable names in the following lines of code:
>
>       DO 100 I = 1.5
>       DO100I = 1,5
>
>and Mike wonders whether lex-based analyzers can handle Fortran syntax.
>
>As the above example shows, lexical analysis of Fortran requires
>complete analysis of each line.  Ignoring blanks is easy (just modify
>input.h as distributed with flink and fxref to skip blanks), but a
>complete analytical tool would have to include the complete Fortran
>parser.  Therefore I always use white space to separate tokens in my
>code, and the tools I develop use this to simplify the task.  If you
>don't insert white space, my tools won't help you.

It is not possible to use lex without a wee bit of help, despite the
suggestion (in Aho & Ullman, Principles of Compiler Design, page 108) that
simple lookahead scanning for a "," would suffice.  The reason is simple:
one must count the nesting level of parens to determine whether a comma is
enclosed within parens, and a "pure" DFA recognizer cannot count to an
arbitrary depth.  For example:

        DO 10 I=A(1,10)
        DO 10 I=B(C(1),10)
        DO 10 I=D(A(1,((10))), C(B(1,10)), 5)

are all assignments to the variable "DO10I", whereas:

        DO10I = C(1),10
        DO10I = A(1,10),((B))
        DO10I = (E+A(1,((10)))), (C(B(1,10))- 5)

are all DO loop headers.  Yuck.
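To make the paren-counting point concrete, here is a minimal sketch of the
kind of "wee bit of help" a scanner needs, written in Python rather than as
lex actions.  The names is_do_header and has_top_level_comma are made up for
illustration, and the sketch assumes a single statement with no character or
Hollerith constants (a comma or paren inside one would have to be skipped):

```python
import re

# Hypothetical helpers for illustration only. Assumes one statement per call
# and no character or Hollerith constants in the statement text.
DO_PREFIX = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=(.+)")

def has_top_level_comma(text):
    """Return True if a comma appears outside all parentheses."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True
    return False

def is_do_header(statement):
    """Return True for a DO loop header, False for anything else."""
    # Blanks are not significant, so remove them before matching.
    squeezed = statement.replace(" ", "").upper()
    match = DO_PREFIX.fullmatch(squeezed)
    if match is None:
        return False
    # A comma at nesting depth zero separates the loop bounds.
    return has_top_level_comma(match.group(3))
```

The depth counter is the part a pure DFA cannot express.  With it,
is_do_header("DO 10 I=B(C(1),10)") is False (an assignment), while
is_do_header("DO10I = C(1),10") is True (a DO loop header).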

By the way, even fairly reasonable folk sometimes rely on spaces not being
separators:

        GO TO 10

instead of:

        GOTO 10

and the variable:

        I LIKE FORTRAN

seems much friendlier than:

        ILIKEFORTRAN

Remember, Fortran doesn't allow "_" in variable names.

Personally, despite the above examples, I think this rule of Fortran 77
does great harm to the readability of one's code because it encourages
inconsistent use of spaces, as well as making the compiler noticeably more
ad-hoc.  I'd like to see this "feature" go away... is it still in 8x?

                                                -hankd

PS: I know removing this feature would break old code, but it is not all
    that difficult to write an ad-hoc program which will automatically
    "clean up" the spacing of old code.