Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!yale!cmcl2!adm!smoke!gwyn
From: gwyn@smoke.brl.mil (Doug Gwyn)
Newsgroups: comp.std.c
Subject: Re: How to write Trigraph like character sequences in a string
Message-ID: <16368@smoke.brl.mil>
Date: 10 Jun 91 02:51:47 GMT
References: <676248139.0@ananke.stgt.sub.org>
Organization: U.S. Army Ballistic Research Laboratory, APG, MD.
Lines: 44

In article <676248139.0@ananke.stgt.sub.org> Andreas.Kaiser@f7014.n244.z2.stgt.sub.org (Andreas Kaiser) writes:
> >Actually, one never NEEDS trigraphs; they're required to be
> >supported as a convenience when interchanging source code among sites or
> >equipment with poor support for the C source character set.
>There are character sets, which do not support all C language characters, such
>as braces and brackets (example: 7-bit german). While it is usually possible to
>use the characters corresponding to these ASCII codes instead (...but almost
>unreadable), it is likewise possible that some text file exchange program
>silently converts these special characters into PC, ISO or ECMA 8-bit
>equivalents. Trigraph characters are available in all roman-style character sets
>and will be understood by all machines.

I wouldn't promise that codes for the "trigraph characters" are
universally available, for example in CDC "display code". However,
they were deliberately chosen from the glyphs that are supposed to
have corresponding codes in all realizations of ISO 646 (rapidly
becoming obsolete).

You miss my point, though. The "C source character set" is NOT,
repeat NOT, the same as whatever character code set is used for
external representation of text characters on a given system. The
C source characters are the result of a TRANSLATION from external
representation to some form internal to the compiler.
In many compiler implementations, this translation uses the same
encoding, but that is NOT a requirement of the standard, and the
freedom to translate one or more external characters to a C source
character can be exploited to nicely support the "difficult" source
characters that often have no "official" external equivalents.

In other words, constructs in Courier font in the C standard need
not be thought of as unmapped external source code representations,
but rather are (a) the internal form of the program after the first
part of translation phase 1, or (b) an external form after some
translation utility has been applied to the external source code to
present it decently on a printer or a more-than-minimal terminal.

The printers and terminals I use, for example, all allow downloading
of font bitmaps, and so long as there are enough code values, which
there always are for 7-bit code sets, I can select those that will
be used to represent the glyphs for "vertical bar" etc. If the
compiler supports the same conventions (as in fact most existing
compilers do, where the USASCII code values are adopted), everything
works well. Problems arise only when one's system has neglected to
provide decent facilities for input/editing/display of the
"difficult" glyphs using the adopted mapping conventions.
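
To make the idea of translating several external characters into one
C source character concrete, here is a minimal sketch in C of just the
trigraph-replacement part of translation phase 1. The function name
replace_trigraphs and its buffer convention are made up for
illustration; this is not how any particular compiler does it, and the
mapping from a system's external codes to source characters (the first
part of phase 1) is assumed to have happened already.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: copy `in` to `out`,
 * replacing each of the nine trigraphs with the single source
 * character it stands for.  `out` must have room for
 * strlen(in) + 1 chars.  Returns the length of the result. */
size_t replace_trigraphs(const char *in, char *out)
{
    /* Third character of each trigraph and its replacement, aligned
     * by index.  Stored this way so that this file itself contains
     * no trigraph sequence. */
    static const char third[] = "=(/)'<!>-";
    static const char repl[]  = "#[\\]^{|}~";
    size_t n = 0;

    while (*in != '\0') {
        const char *hit;

        /* Two question marks followed by one of the nine listed
         * characters form a trigraph; anything else is copied
         * through unchanged. */
        if (in[0] == '?' && in[1] == '?' && in[2] != '\0'
            && (hit = strchr(third, in[2])) != NULL) {
            out[n++] = repl[hit - third];
            in += 3;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Because this replacement happens before string literals are even
recognized, a literal meant to contain a trigraph-like sequence has to
break the sequence up, for example by writing "?\?=" rather than "??=";
the \? escape exists for exactly this purpose.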