Path: utzoo!attcan!uunet!mcsun!ukc!axion!masalla.fulcrum.bt.co.uk!beta.its.bt.co.uk!tjo
From: tjo@its.bt.co.uk (Tim Oldham)
Newsgroups: comp.lang.c
Subject: Re: Re^2: Why nested comments not allowed?
Message-ID:
Date: 20 Feb 90 17:26:57 GMT
References: <236100027@prism> <1414@amethyst.math.arizona.edu> <1523@wacsvax.OZ> <4320@daffy.cs.wisc.edu>
Sender: igb@fulcrum.bt.co.uk (Ian G Batten)
Organization: BT Applied Systems, Birmingham, UK
Lines: 27

In article <4320@daffy.cs.wisc.edu> schaut@cat9.cs.wisc.edu (Rick Schaut) writes:
>
>I think you've missed the point. In compilers for languages that do not
>allow nested comments the parser never see the comment at all. The comments
>are eaten by the scanner (which is a much simpler part of the compiler than
>is a parser). Essentially, any language that requires balancing characters
>(e.g. the language of balanced parens) cannot be represented using regular
>expressions, and regular expressions are the construct upon which scanners
>are based. In short, a compiler for a language that doesn't allow nested
>comments is _much_ faster than a compiler for a language that allows them.

In a Modula-2 interpreter I was once involved with, the scanner simply
matched the first start_of_comment and then called eat_nested_comment(),
an extremely simple and fast recursive routine. This also allowed the
start of the comment to be printed if an error occurred, i.e. mismatched
close-comments.

As you're throwing away everything between the beginning and end of the
outermost comment, it's a completely different category of problem. The
parser never sees any of the comment. In fact, you can guarantee that
the main scanner token-matcher will only ever see an outermost
start_of_comment. You never generate any tokens for the parser.

	Tim.
-- 
Tim Oldham, BT Applied Systems. tjo@its.bt.co.uk or ...!ukc!axion!its!tjo
``Asking questions is the best way to get answers.'' --- Philip Marlowe.
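
For illustration, the recursive routine described in the post might look
roughly like this in C. This is only a sketch under stated assumptions,
not the interpreter's actual code: it assumes Modula-2 style "(*" and
"*)" delimiters and a NUL-terminated input buffer, and skip_comment() is
a hypothetical stand-in for the scanner's start_of_comment match. Only
the name eat_nested_comment() comes from the post; its signature here is
invented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: skip one comment, nested comments included.
 * `p` points just past an opening "(*". Returns a pointer just past
 * the matching "*)", or NULL if the input ends before the comment is
 * closed. */
const char *eat_nested_comment(const char *p)
{
    assert(p != NULL);
    while (*p != '\0') {
        if (p[0] == '(' && p[1] == '*') {
            /* Inner comment: recurse, then carry on after it. */
            p = eat_nested_comment(p + 2);
            if (p == NULL)
                return NULL;    /* inner comment never closed */
        } else if (p[0] == '*' && p[1] == ')') {
            return p + 2;       /* end of this nesting level */
        } else {
            p++;                /* ordinary comment text, discarded */
        }
    }
    return NULL;                /* ran out of input */
}

/* Hypothetical scanner fragment: if a comment starts at `p`, skip all
 * of it; otherwise return `p` unchanged. No token is produced either
 * way, so the parser never sees the comment. */
const char *skip_comment(const char *p)
{
    assert(p != NULL);
    if (p[0] == '(' && p[1] == '*')
        return eat_nested_comment(p + 2);
    return p;
}
```

Because skip_comment() still holds the position of the outermost "(*"
when eat_nested_comment() returns NULL, a caller could report where the
unterminated comment began, which is the error-reporting benefit the
post mentions.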