Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!ames!think!samsung!cs.utexas.edu!longway!std-unix
From: wheeler@ida.org (David Wheeler)
Newsgroups: comp.std.unix
Subject: Re: Report on WG15 Rapporteur Group
Message-ID: <568@longway.TIC.COM>
Date: 16 Mar 90 23:35:09 GMT
Sender: std-unix@longway.TIC.COM
Reply-To: std-unix@uunet.uu.net
Lines: 61
Approved: jsq@longway.tic.com (Moderator, John S. Quarterman)

From: wheeler@ida.org (David Wheeler)

domo@tsa.co.uk (Dominic Dunlop):
= From: Dominic Dunlop
=
= Report on ISO/IEEE JTC1/SC22/WG15 Rapporteur Group on
= Internationalization Meeting of 5th - 7th
= March, 1990, Copenhagen, Denmark
=
= Dominic Dunlop -- domo@tsa.co.uk
=
= The Standard Answer Ltd.
=

I enjoyed your posting, thank you! You included a lot of "what this
phrase really means" that I appreciated.

=
= 3. ISO 646[4], the earliest ISO standard for information
= technology, is the international derivative of ASCII.
= Its Danish variant replaces ASCII's } with aa. Around
= the world, #$@[\]^`{|}~, all of which have a special
= meaning to the shell, are replaced by other characters
= in standards derived from ISO 646. See [5] for much
= more information.
=

Isn't there an 8-bit standard character set that defines the first 128
characters as a standard set (say as USASCII, provincial I'm afraid but
it would break no Unix tools), then includes all the international
characters as those with values > 127? If this were used in the POSIX
standard, wouldn't this solve many problems for those using a
Latin-based alphabet? Or is this standard unused in the real world?
Admittedly this eliminates the non-Latin alphabet world, and that is a
weakness.

= Apart from all this organizational stuff, we did review some
= existing documents. For example, DTR (draft technical
= report) 10176, a product of SC14, discusses the treatment of
= characters appearing in language constructs, variable names,
= literals and comments, and turns out to have implications
= for sh, awk, yacc and the other ``little languages'' defined
= in DP 9945-2, the forthcoming international standard for the
= shell and tools. And a document from SC22's study group on
= character sets suggests that source files should have some
= means of announcing the character set that they're using.
= Could this mean typed files or resource forks for POSIX6?
= Gee. How would we hide that?
=

Some C programs would have to be fixed to deal with signed characters
but at least the rules would be simple: 128+ are ordinary characters &
can be used in identifiers, etc.

Source file tagging for language sounds like an abomination!

--- David A. Wheeler
wheeler@ida.org

Volume-Number: Volume 18, Number 80