Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!genat!maccs!gordan
From: gordan@maccs.UUCP (Gordan Palameta)
Newsgroups: comp.std.internat
Subject: Re: Character representation
Message-ID: <719@maccs.UUCP>
Date: Sun, 16-Aug-87 22:48:41 EDT
Article-I.D.: maccs.719
Posted: Sun Aug 16 22:48:41 1987
Date-Received: Mon, 17-Aug-87 02:36:16 EDT
References: <2171@enea.UUCP> <709@maccs.UUCP> <2183@enea.UUCP>
Reply-To: gordan@maccs.UUCP (Gordan Palameta)
Organization: DCSS, McMaster University, Hamilton, Ontario, Canada
Lines: 43

In article <2183@enea.UUCP> sommar@enea.UUCP(Erland Sommarskog) writes:
>In a recent article gordan@maccs.UUCP (Gordan Palameta) writes:
>>In article <2171@enea.UUCP> sommar@enea.UUCP(Erland Sommarskog) writes:
>
>But this doesn't
>address all problems I mentioned. How to construct a general character
>with an arbitrary accent, umlaut or other diacritic mark? An 8-bit
>enumarate isn't sufficient.

t umlaut or q cedilla would probably be used very rarely, nor is it likely
that anyone would go to the trouble of designing a font to accommodate such
characters.

Another cost of such generality would be that accents and other marks would
probably have to be indicated by escape sequences in conjunction with the
unmodified letter.  This would make string-processing software more
complicated (and slower), and text would be longer.

>>Despite 7-bit ASCII, which makes possible code such as
>>    if (c >= 'A' && c <= 'Z')
>>there is no reason why the numeric representation of a character should have
>>anything to do with the position of that character in a collating sequence.
>
>Right, but almost all programming today depends on it, isn't it so?
>It's easier to implement and executes faster.

Not at all, just define a 256-byte lookup table in an include file, and
modify the code to
    if (coll[c] >= FIRST_CHAR && coll[c] <= LAST_CHAR)
with very little loss of efficiency.

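For concreteness, here is a minimal sketch of what such a table might look
like.  Everything beyond coll, FIRST_CHAR and LAST_CHAR is made up for
illustration: the helper names (coll_init, is_upper_letter, coll_cmp) are
hypothetical, and the sketch assumes an ISO 8859-1 style 8-bit code and
Swedish alphabetical order (A..Z followed by A-ring, A-umlaut, O-umlaut).
It also fills the table at start-up rather than shipping it precomputed in
an include file, purely to keep the example short.

```c
#include <assert.h>
#include <string.h>

/* Ranks of the first and last upper-case letter in the collating
   sequence (hypothetical values; rank 0 is reserved for "not a letter"). */
enum { FIRST_CHAR = 1, LAST_CHAR = 29 };

/* coll[c] is the collating rank of character code c. */
static unsigned char coll[256];

/* Upper-case letters in Swedish dictionary order, as ISO 8859-1 bytes
   (assumed encoding): A..Z, A-ring 0xC5, A-umlaut 0xC4, O-umlaut 0xD6. */
static const unsigned char order[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ\xC5\xC4\xD6";

/* Fill the table once, before any lookup. */
void coll_init(void)
{
    size_t i, n = sizeof order - 1;      /* drop the trailing NUL */

    assert(n == LAST_CHAR);              /* ranks run FIRST_CHAR..LAST_CHAR */
    memset(coll, 0, sizeof coll);
    for (i = 0; i < n; i++)
        coll[order[i]] = (unsigned char)(FIRST_CHAR + i);
}

/* Table-driven replacement for: c >= 'A' && c <= 'Z' */
int is_upper_letter(int c)
{
    unsigned char r = coll[(unsigned char)c];   /* cast: char may be signed */
    return r >= FIRST_CHAR && r <= LAST_CHAR;
}

/* Negative, zero or positive as a sorts before, with, or after b. */
int coll_cmp(int a, int b)
{
    return (int)coll[(unsigned char)a] - (int)coll[(unsigned char)b];
}
```

After coll_init(), is_upper_letter(0xC5) is true and coll_cmp(0xC5, 0xC4)
is negative, i.e. A-ring sorts before A-umlaut even though its code is the
larger of the two.  The cost per test is one extra array index.  The cast to
unsigned char matters where plain char is signed; without it, codes above
127 would index before the start of the table.
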
To accommodate perverse languages like Spanish and Polish, which insist on
treating two-letter combinations as units for sorting, this won't do, however:
change the square brackets to round ones, i.e. make coll a function rather
than an array (with some loss of efficiency, but very little inconvenience
in coding).  (Well, some inconvenience: c can't be a single character any
more.)

Never mind the French; what if things had turned out differently in 1588
with the Armada, and the Spanish had invented computers?  Or the Chinese?

Followups to alt.universes.
-- 
UUCP:   ... !mnetor!lsuc!maccs!gordan
BITNET: GP@TANDEM
"Sumasshedshii vsekh stran, soyedinyaites'"
Gordan Palameta