Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!uw-june!uw-entropy!dataio!bright
From: bright@Data-IO.COM (Walter Bright)
Newsgroups: comp.lang.c
Subject: Re: Why are character arrays special (e
Message-ID: <1879@dataio.Data-IO.COM>
Date: 15 Feb 89 21:36:16 GMT
References: <19742@uflorida.cis.ufl.EDU> <225800126@uxe.cso.uiuc.edu> <1989Feb10.191041.12109@utzoo.uucp> <1875@dataio.Data-IO.COM> <1989Feb14.161906.16138@utzoo.uucp>
Reply-To: bright@dataio.Data-IO.COM (Walter Bright)
Organization: Data I/O Corporation; Redmond, WA
Lines: 30

In article <1989Feb14.161906.16138@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>In article <1875@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>A compiler that spends most of its time tokenizing source obviously isn't
>working very hard at code generation.
The optimizer is a separate pass; it's designed that way so the user has the
choice of fastcompile/slowexecute or slowcompile/fastexecute.
BTW, my code generator is very efficient (it's not table driven, it's all
ad-hoc inline stuff, and is heavily optimized).
>>... I've found in my compiler (Zortech) that ONE extra
>>instruction executed per char read slows down the compiler by 5 to 10%.
>Hmm, that's 10-20 instructions per character, with C typically about
>20 chars/line, and with even a decidedly slow machine delivering an
>instruction per microsecond, gives us 2500+ lines/second.
I tested it on my own code, which includes lots of comments, macro defs
and extern defs (all the .h files). Most of the identifiers are
relatively long. The number of lines in the .h files dwarfs the number
of lines in the .c files. None of this stuff goes through the
code generator, so my results differ from those for code which
consists mostly of expressions. The ideal is to get the speed of processing
white space, comments, and false conditionals to approximate the speed
of simply reading characters from a file.
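
For reference, the arithmetic behind the quoted 2500+ lines/second figure can
be reproduced with a small sketch. The function names here are hypothetical
and are not taken from any compiler. The only inputs are the numbers quoted
above: one extra instruction costing 5 to 10% of run time, about 20
chars/line, and one instruction per microsecond.

```c
#include <assert.h>

/* Hypothetical helper: if ONE extra instruction per character costs
   slowdown_fraction of total run time (0.05 for 5%, 0.10 for 10%),
   the scanner is spending about 1/slowdown_fraction instructions
   on each character. */
double instr_per_char(double slowdown_fraction)
{
    assert(slowdown_fraction > 0.0);
    return 1.0 / slowdown_fraction;
}

/* Hypothetical helper: lines processed per second, given the
   instructions spent per character, the average line length in
   characters, and the machine speed in instructions per second. */
double lines_per_second(double ipc, double chars_per_line,
                        double instr_per_sec)
{
    assert(ipc > 0.0 && chars_per_line > 0.0);
    return instr_per_sec / (ipc * chars_per_line);
}
```

With a 5% slowdown (20 instructions/char), 20 chars/line and 1e6
instructions/second, lines_per_second() gives 2500. With a 10% slowdown
(10 instructions/char) it gives 5000. The estimate counts only the per-char
work, and it assumes source dominated by expressions rather than by headers,
comments and long identifiers.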
>If it's that big a deal, have you considered having a default "no trigraphs"
>mode and a slower "trigraphs" mode?  That way, if nobody uses it, there's
>no impact except for a bit of code that never gets executed.
That's the way I decided to implement it. The main difficulty, however, with
this approach is that magazine C compiler reviewers frequently don't read
the manual, and may simply run the compiler with the default settings, and
wrongly conclude that it doesn't support trigraphs. For example, the
latest BYTE review feature list for Zortech C contains numerous errors, all
resulting from the reviewers not reading the manual.
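
A minimal sketch of the two-mode idea follows. It is an illustration only, not
the Zortech source: read_chars(), trigraph_char() and the trigraphs_enabled
flag are hypothetical names, and a real compiler would read from a file buffer
rather than copy strings. The point is that the mode test is made once,
outside the loop, so the default path pays nothing per character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the third character of a candidate
   trigraph (the one after two question marks), return the character
   it stands for, or 0 if it is not one of the nine ANSI trigraphs. */
static char trigraph_char(char c)
{
    switch (c) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical reader: copy src to dst, translating trigraphs only
   when trigraphs_enabled is nonzero. dst must have room for
   strlen(src) + 1 bytes. Returns the number of characters written,
   not counting the terminating NUL. */
size_t read_chars(const char *src, char *dst, int trigraphs_enabled)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (!trigraphs_enabled) {
        /* Default mode: plain copy, no trigraph test per character. */
        while (*src)
            dst[n++] = *src++;
    } else {
        /* Trigraph mode: extra compares on every character. */
        while (*src) {
            char r;
            if (src[0] == '?' && src[1] == '?' &&
                (r = trigraph_char(src[2])) != 0) {
                dst[n++] = r;
                src += 3;
            } else {
                dst[n++] = *src++;
            }
        }
    }
    dst[n] = '\0';
    return n;
}
```

Called with trigraphs_enabled set to 0, the input comes back unchanged; with
it set to 1, each question-mark sequence is replaced by its single-character
equivalent. The code avoids spelling out any trigraph literally, since a
compiler with trigraphs turned on would translate it inside the source.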