Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!umd5!brl-adm!brl-smoke!gwyn
From: gwyn@brl-smoke.ARPA (Doug Gwyn )
Newsgroups: comp.lang.c
Subject: Re: trigraphs in X3J11
Message-ID: <7969@brl-smoke.ARPA>
Date: 26 May 88 17:11:57 GMT
References: <5215@ico.ISC.COM> <7937@brl-smoke.ARPA> <5424@ico.ISC.COM>
Reply-To: gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>)
Organization: Ballistic Research Lab (BRL), APG, MD.
Lines: 74

In article <5424@ico.ISC.COM> rcd@ico.ISC.COM (Dick Dunn) writes:
>Thanks to Doug Gwyn for some answers on trigraphs.  Unfortunately, the more
>I learn, the less I like them...but that's not Doug's fault.

Thanks for recognizing that I don't like them either and am just trying
to explain what I think X3J11's motivation/reasoning was.  Of course,
I'm not speaking officially for X3J11 here and may have gotten this
wrong (the original decision was made before I started attending meetings).

One thing to keep in mind is that almost everyone agrees that it is
important for the ANSI and ISO standards for C to be technically
identical.  Therefore X3J11 is dealing with internationalization
issues, even though this might seem unnecessary for ANSI purposes.

>| Avoid "quiet changes."

The proposed ANSI/ISO C standard introduces several "quiet changes",
as noted in the Rationale document.  Certainly one guideline was to
minimize these, but there were many guidelines and they conflicted
to some degree.  Therefore compromises had to be worked out; if it
makes you feel better, call these "optimal solutions to constrained
problems" instead of "compromises".

>There are real examples of code currently in use which will be "broken"
>if recompiled by a compiler conforming to this part of the draft standard.

Yes, that's true for all "quiet changes".

>I have a philosophical view that this problem would be better off with
>no solution than with a clumsy solution that breaks existing code.

I don't think "no solution" was considered acceptable to ISO at the
time.

>... R"stuff??/n" would mean "stuff\n".

This is not a bad idea, but as the proposed standard stands trigraphs are
mapped well before anything else is done to analyze the source code, so
the "??/" would not hang around long enough for this method to be applied.
If it weren't for the need to deal with {} etc. then trigraph mapping
could possibly be deferred, but the main use of trigraphs is for {} etc.
so the mapping cannot be deferred long enough.
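
To make the ordering concrete, here is a minimal sketch of trigraph mapping
as a standalone pass over raw source text, run before any tokenizing.  The
names trigraph_target and map_trigraphs are made up for illustration and do
not come from the draft or from any compiler.  The question marks in the
usage note are written as \? escapes so that the sample reads the same
whether or not the compiler maps trigraphs itself.

```c
#include <assert.h>
#include <stddef.h>

/* Map the third character of a trigraph to its replacement,
 * or return 0 if the sequence is not one of the nine trigraphs. */
static char trigraph_target(char third)
{
    switch (third) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Copy src to dst, replacing each trigraph with its single character.
 * dst must have room for strlen(src) + 1 bytes; output never grows.
 * This pass knows nothing about strings or comments, which is why a
 * trigraph inside a string literal is gone before the literal is seen. */
void map_trigraphs(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        char repl;
        if (src[0] == '?' && src[1] == '?' &&
            (repl = trigraph_target(src[2])) != 0) {
            *dst++ = repl;
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Given the input text "stuff\?\?/n" (as a C literal), map_trigraphs produces
the six characters s t u f f followed by a backslash and n, so a later phase
sees an ordinary escape sequence and cannot tell it was ever a trigraph.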

>What about an ISO 8859 character set?  Wouldn't that cover a lot of the
>problem area?

It was considered inappropriate for the C standard to constrain the choice
of character set like that.  However, the draft recently was revised to promise
that '0' through '9' have ascending numerical representations, and of
course it does require that a large set of characters be representable,
so there is some precedent.  I doubt that enough vendors would support
such a requirement, though.
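
As an illustration of what the digit promise buys, here is a small sketch.
It assumes the guarantee is that '0' through '9' are consecutive (each one
greater than the previous), which is stronger than merely ascending; the
function names digit_value and digit_char are hypothetical.

```c
#include <assert.h>

/* Return the numeric value of decimal digit character c,
 * or -1 if c is not a decimal digit.  Both the range test and the
 * subtraction rely on '0'..'9' being consecutive codes. */
int digit_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}

/* Return the digit character for a value in the range 0..9. */
int digit_char(int v)
{
    assert(v >= 0 && v <= 9);
    return '0' + v;
}
```

No similar arithmetic is portable for letters, since nothing promises that
'a' through 'z' are contiguous in every character set.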

The ISO 646-1983 invariant code set was taken as the least common
denominator for representable character glyphs.  I think that was the
real mistake; glyphs are just silly marks on paper or displays, and we
aren't really interested in their shapes other than that all of the ones
we need for C be unique.  I don't much care if { sometimes looks like [[
or \(lb, so long as I have tools for dealing with it when I program.

>What do Europeans do about C now?

The only existing practice I had heard about was use of (< >) etc.,
details varying from place to place.  Perhaps some Europeans can
contribute more info here.

>Then I wish folks had pushed against them harder.

Two factors conspired here.  One is that many existing
environments don't offer much support for better source code
import/export/printing translation, which is how I think this
issue should be dealt with.  The other is that "ISO insisted on
this sort of solution", which may or may not be true but it
certainly makes it hard to deal with since X3J11 and the ISO C
people don't meet concurrently.
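
For what an export translation step might look like, here is a rough sketch
that rewrites the nine characters missing from the ISO 646 invariant set as
their trigraph spellings.  The function name export_to_invariant and its
interface are invented for this example; a real tool would more likely map
to whatever local convention the receiving site uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src to dst, spelling each of the nine characters outside the
 * ISO 646 invariant set as a three-character trigraph.
 * Writes at most dstsize bytes including the terminating null and
 * stops early (truncating) if the buffer would overflow.
 * Returns the number of characters written, not counting the null. */
size_t export_to_invariant(const char *src, char *dst, size_t dstsize)
{
    /* Parallel tables: national[i] is exported as '?', '?', third[i]. */
    static const char national[] = "#[\\]^{|}~";
    static const char third[]    = "=(/)'<!>-";
    size_t n = 0;

    assert(src != NULL && dst != NULL && dstsize > 0);
    for (; *src != '\0'; src++) {
        const char *p = strchr(national, *src);
        if (p != NULL) {
            if (n + 3 >= dstsize)
                break;                  /* no room for trigraph + null */
            dst[n++] = '?';
            dst[n++] = '?';
            dst[n++] = third[p - national];
        } else {
            if (n + 1 >= dstsize)
                break;                  /* no room for char + null */
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```

One caveat with so naive a filter: text that already contains two question
marks followed by one of the mapped characters passes through unchanged and
would be altered on import, so a careful tool would need to handle that case.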