Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site decvax.UUCP
Path: utzoo!linus!decvax!minow
From: minow@decvax.UUCP (Martin Minow)
Newsgroups: net.lang.c,net.unix-wizards
Subject: C and national character sets
Message-ID: <60@decvax.UUCP>
Date: Wed, 29-Aug-84 21:41:27 EDT
Article-I.D.: decvax.60
Posted: Wed Aug 29 21:41:27 1984
Date-Received: Thu, 30-Aug-84 12:39:11 EDT
References: <265@diku.UUCP>
Organization: DEC UNIX Engineering Group
Lines: 34

Keld J|rn Simonsen brings up an important point concerning C and its standardization. (By the way, the | is the oe ligature character, needed in the Scandinavian languages as well as German.)

He notes that several characters used by C are reserved by ISO standards for "national replacement characters". The reserved characters are #@[\]^_`{|}~ -- most of which are used in some way by C.

There isn't any really good solution -- it is highly unlikely that the C standardization committee will remove these characters from the language. While most of them can be replaced by suitable #defines, several cannot, notably backslash.

The only short-term solution would be for the affected parties to write NRC-specific pre-processors.

In the long term, however, the problem will go away as people move to an 8-bit character set such as DEC Multinational or the pending ISO standard that is almost identical to it. In this standard, the characters in the range 0-127 are identical to the U.S. ASCII 7-bit standard. Characters in the range 128-159 are used for additional controls, and 160-255 for additional graphics.

It is actually possible -- though rather messy -- to intermix NRCs and Multinational, allowing Standard C to be written from a terminal that normally displays a non-English NRC set. Unfortunately, this will require a pre-processor that understands the character-set switching escape sequences. This could be done as a Unix filter, of course.

Hope this helps. Hej s} l{nge.
Martin Minow
decvax!minow
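
As a rough illustration of the 8-bit layout described above, here is a minimal C sketch that sorts a byte into the three ranges mentioned (0-127 ASCII, 128-159 additional controls, 160-255 additional graphics) and checks whether a character is on the list of reserved positions quoted in the post. The names classify_byte, is_nrc_position, and the enum values are made up for this example; they come from no standard or library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names, invented for this sketch. The three ranges
 * follow the 8-bit layout described in the post. */
enum char_class {
    CC_ASCII,          /* 0-127: same as 7-bit U.S. ASCII */
    CC_EXTRA_CONTROL,  /* 128-159: additional control characters */
    CC_EXTRA_GRAPHIC   /* 160-255: additional graphic characters */
};

enum char_class classify_byte(unsigned char c)
{
    if (c < 128)
        return CC_ASCII;
    if (c < 160)
        return CC_EXTRA_CONTROL;
    return CC_EXTRA_GRAPHIC;
}

/* Returns 1 if c is one of the characters the post lists as reserved
 * for national replacement, 0 otherwise. The list is copied from the
 * post as written; it is not checked against the ISO text. */
int is_nrc_position(int c)
{
    static const char reserved[] = "#@[\\]^_`{|}~";

    /* strchr would also match the terminating NUL, so exclude it. */
    return c != '\0' && strchr(reserved, c) != NULL;
}

const char *class_name(enum char_class cc)
{
    switch (cc) {
    case CC_ASCII:         return "ASCII";
    case CC_EXTRA_CONTROL: return "additional control";
    default:               return "additional graphic";
    }
}
```

A filter of the kind suggested in the post could call classify_byte on each input byte to decide whether it needs translating, and is_nrc_position to flag the 7-bit positions that a national terminal may display differently. Handling the character-set switching escape sequences is a separate and messier job that this sketch does not attempt.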