Path: utzoo!attcan!uunet!snorkelwacker!spdcc!esegue!johnl
From: johnl@esegue.segue.boston.ma.us (John R. Levine)
Newsgroups: comp.arch
Subject: Re: Algol was an advance, was He's not the only one at it again!
Summary: argument about language architecture
Message-ID: <1990Jul30.174035.26412@esegue.segue.boston.ma.us>
Date: 30 Jul 90 17:40:35 GMT
References: <1288@s8.Morgan.COM> <58372@lanl.gov>
Reply-To: johnl@esegue.segue.boston.ma.us (John R. Levine)
Organization: Segue Software, Cambridge MA
Lines: 59

In article <58372@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>> One of the most important aspects of ALGOL is the grammar on which
>> it was based.  Context free grammars have since been almost universal
>> for high level languages, ...

>Context free means something different than you are using it for here.
>What you are talking about (apparently) is free-form syntax (which is
>a mixed bag - at least not all of the aspects of it as ALGOL defined
>them are a good idea).  Context free has to do with the formal specification
>of the syntax - Fortran is context free (in fact, it's LR(k) - it _would_
>be LR(1) if blanks had been significant).

Algol 60 was the first language to be defined with a formal syntax,
specified in what has come to be called BNF.  The syntax of Fortran was
given informally in the manual.

Parsing Fortran is quite context dependent (I know, because I've written
some real Fortran parsers.)  Since you have to ignore blanks, at least
until F90 becomes widely accepted, the token boundaries depend entirely on
the context, e.g. 10E5 is sometimes a floating point number, and sometimes
not, as in these:

	DO 10E5 = 10E5
	DO 10E5 = 1, 10

The first is an assignment to DO10E5, and the second is a DO loop using E5
as a variable name.  If your Fortran allows long variable names, this is
usually the first line of a function:

	REAL FUNCTION X(N)

unless it is preceded by a PARAMETER statement that defines N, in which
case it's a declaration of the array FUNCTIONX.
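
To make the DO case concrete, here is a rough sketch of one common way to
settle it: look ahead past the = sign for a comma that is not nested inside
parentheses.  This is only an illustration in Python, not how any particular
Fortran compiler does it.  The names classify_do, strip_blanks and
has_top_level_comma are made up for the example, and it assumes the
statement contains no character or Hollerith constants.

```python
import re

def strip_blanks(stmt: str) -> str:
    # Drop every blank; assumes no character or Hollerith constants,
    # where blanks would be significant.
    return stmt.replace(" ", "").upper()

def has_top_level_comma(text: str) -> bool:
    # True if a comma appears outside all parentheses.
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True
    return False

# DO, a statement label, a variable name, '=', then the rest of the line.
DO_PREFIX = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=(.*)$")

def classify_do(stmt: str):
    # Hypothetical helper: decide between a DO loop and an assignment.
    s = strip_blanks(stmt)
    m = DO_PREFIX.match(s)
    if m and has_top_level_comma(m.group(3)):
        return ("do_loop", m.group(1), m.group(2))
    name, _, _ = s.partition("=")
    return ("assignment", name)

print(classify_do("DO 10E5 = 10E5"))
print(classify_do("DO 10E5 = 1, 10"))
```

The decision cannot be made until the whole right-hand side has been
scanned, which is the point: the tokens on the left depend on what turns up
later in the statement.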

Even if you made spaces significant and token boundaries were obvious,
Fortran appears to me not to be LR(k) for any finite k, since you can write
this sort of thing:

	FORMAT(I1,I2,I3,I4,I5) = 3

and until you see the = sign, you don't know whether it's a statement
function or a format statement.  (A limit on k is provided by the maximum
length of a statement, but that's a pretty cheesy out.)

My goal here is not to dump on Fortran; its syntactic peculiarities are
well known and entirely tractable, albeit not by conventional tools like
lex and yacc without a tremendous amount of kludgery.  The point is that
formally specifying Algol was a big step forward, both because formal
specification languages are the cornerstone of every modern compiler
writing system, and because it is my observation that context free
languages make it a lot harder to accidentally write a program in which a
small typo makes it mean something entirely different from what it looks
like it means.

(Of course, just because Algol has a context-free, easy-to-tokenize syntax
doesn't mean that actual implementations all do.  There was IBM's Algol F,
which ignored blanks and required you to put quotes around your keywords,
proving once again that Real programmers can write Fortran programs in any
language.)

-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
Marlon Brando and Doris Day were born on the same day.