Path: utzoo!attcan!uunet!mcsun!ukc!axion!masalla.fulcrum.bt.co.uk!beta.its.bt.co.uk!tjo
From: tjo@its.bt.co.uk (Tim Oldham)
Newsgroups: comp.lang.c
Subject: Re: Re^2: Why nested comments not allowed?
Message-ID: <TC$#8^-@masalla.fulcrum.bt.co.uk>
Date: 20 Feb 90 17:26:57 GMT
References: <236100027@prism> <1414@amethyst.math.arizona.edu> <1523@wacsvax.OZ> <4320@daffy.cs.wisc.edu>
Sender: igb@fulcrum.bt.co.uk (Ian G Batten)
Organization: BT Applied Systems, Birmingham, UK
Lines: 27

In article <4320@daffy.cs.wisc.edu> schaut@cat9.cs.wisc.edu (Rick Schaut) writes:
>
>I think you've missed the point.  In compilers for languages that do not
>allow nested comments the parser never sees the comment at all.  The comments
>are eaten by the scanner (which is a much simpler part of the compiler than
>is a parser).  Essentially, any language that requires balancing characters
>(e.g. the language of balanced parens) cannot be represented using regular
>expressions, and regular expressions are the construct upon which scanners
>are based.  In short, a compiler for a language that doesn't allow nested
>comments is _much_ faster than a compiler for a language that allows them.

In a Modula-2 interpreter I was once involved with, the scanner simply
matched the first start_of_comment and then called eat_nested_comment(),
an extremely simple and fast recursive routine. This also allowed the
start of the comment to be printed if an error occurred, i.e. unmatched
close-comments.
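
A minimal sketch of what such a routine might look like in C. It assumes
Modula-2's (* ... *) delimiters and a NUL-terminated input buffer; the
signature and the NULL-on-error convention are illustrative assumptions,
not the interpreter's actual code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch of eat_nested_comment() for Modula-2 style
 * comments.  On entry, p points just past an opening "(*".
 * Returns a pointer just past the matching "*)", or NULL if the
 * input ends before the comment is closed.
 */
const char *eat_nested_comment(const char *p)
{
    assert(p != NULL);
    while (*p != '\0') {
        if (p[0] == '(' && p[1] == '*') {
            /* Inner comment: recurse, then carry on at this level. */
            p = eat_nested_comment(p + 2);
            if (p == NULL)
                return NULL;
        } else if (p[0] == '*' && p[1] == ')') {
            return p + 2;       /* this level is closed */
        } else {
            p++;                /* ordinary comment text: discard */
        }
    }
    return NULL;                /* ran out of input: unterminated */
}
```

The caller still holds the position of the outermost "(*" when NULL comes
back, so it can report where the offending comment started.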

As you're throwing away everything in between the beginning and end of
the outermost comment, it's a completely different category of problem
from parsing balanced constructs.
The parser never sees any of the comment. In fact, you can guarantee
that the main scanner token-matcher will only ever see an outermost
start_of_comment. You never generate any tokens for the parser.
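
To illustrate, here is a rough sketch of a scanner loop in C that only
ever reacts to an outermost "(*" and hands nothing from inside it to the
parser. Everything here is an assumption made for the example:
count_tokens() and skip_comment() are made-up names, "tokens" are just
whitespace-separated runs, and the comment skipper uses a depth counter
rather than recursion to keep the block short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Skip a nested comment; p points just past the opening "(*".
 * Depth-counter variant (an assumption, not the recursive routine
 * described above).  Returns NULL if the comment is never closed.
 */
static const char *skip_comment(const char *p)
{
    int depth = 1;

    while (*p != '\0') {
        if (p[0] == '(' && p[1] == '*') {
            depth++;
            p += 2;
        } else if (p[0] == '*' && p[1] == ')') {
            p += 2;
            if (--depth == 0)
                return p;
        } else {
            p++;
        }
    }
    return NULL;
}

/*
 * Count the tokens that would be handed to the parser.
 * Returns -1 if a comment is left unterminated.
 */
int count_tokens(const char *src)
{
    int n = 0;
    const char *p = src;

    assert(src != NULL);
    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;
        } else if (p[0] == '(' && p[1] == '*') {
            /* Outermost start_of_comment: discard, emit no token. */
            p = skip_comment(p + 2);
            if (p == NULL)
                return -1;
        } else {
            /* One token: runs until whitespace or a comment opener. */
            n++;
            while (*p != '\0' && !isspace((unsigned char)*p)
                   && !(p[0] == '(' && p[1] == '*'))
                p++;
        }
    }
    return n;
}
```

Inner "(*" openers are consumed inside skip_comment(), so the main loop's
comment branch fires once per outermost comment and never adds to the
token count.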

	Tim.
-- 
Tim Oldham, BT Applied Systems. tjo@its.bt.co.uk or ...!ukc!axion!its!tjo
``Asking questions is the best way to get answers.'' --- Philip Marlowe.