Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!genat!maccs!gordan
From: gordan@maccs.UUCP (Gordan Palameta)
Newsgroups: comp.std.internat
Subject: Re: Character representation
Message-ID: <709@maccs.UUCP>
Date: Wed, 12-Aug-87 22:25:05 EDT
Article-I.D.: maccs.709
Posted: Wed Aug 12 22:25:05 1987
Date-Received: Sat, 15-Aug-87 05:27:18 EDT
References: <2171@enea.UUCP>
Reply-To: gordan@maccs.UUCP (Gordan Palameta)
Organization: DCSS, McMaster University, Hamilton, Ontario, Canada
Lines: 28

In article <2171@enea.UUCP> sommar@enea.UUCP(Erland Sommarskog) writes:
>
>value. But isn't a character a more complicated data type
>than just a simple enumeration type? In some languages the
>combination may constitute a new letter ("a" with ring and dots,
>"o" with dots in Swedish), in others you can apply accents and
>other signs without affecting the sorting. (E.g. French, Italian)
>   I think that the simple representation for characters is completely
>due to the dominating position of the English language in the computer
>world. If computers had been invented in France the problem would
>have been solved. (And if they had been Swedish, Englishmen would

It gets even more complicated: in Spanish, I believe, ch is considered
a separate letter, between c and d in alphabetical order (likewise
with ll, which falls between l and m).

It only goes to show that alphabetical order is language-dependent, and
the same strings will sort differently depending on locale. The only
general solution is to have intelligent operating system routines to
handle sorting. Despite 7-bit ASCII, which makes possible code such as

     if (c >= 'A' && c <= 'Z')

there is no reason why the numeric representation of a character should
have anything to do with the position of that character in a collating
sequence.
-- 
UUCP:   ... !mnetor!lsuc!maccs!gordan
BITNET: GP@TANDEM
"Sumasshedshii vsekh stran, soyedinyaites'"
Gordan Palameta
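
To make the ch/ll point above concrete, here is a rough sketch in C of a comparison routine whose ordering comes from a table rather than from character codes. Everything in it is an assumption made for illustration: the names ORDER, next_weight and spanish_compare are hypothetical, the weights are arbitrary, and accented letters and n-with-tilde are ignored entirely. Where a suitable locale is installed, the standard library's setlocale() and strcoll() would be the more usual route; this only shows the idea.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical collating table: a letter's position in this string,
   not its character code, decides where it sorts. */
static const char ORDER[] = "abcdefghijklmnopqrstuvwxyz";

/* Return the weight of the collating element at *p and advance *p
   past it.  0 marks end of string; 1 is used for any character not
   found in ORDER, so such characters sort before all letters.
   The digraphs "ch" and "ll" are treated as single elements that
   sort just after "c" and "l" respectively (assumed traditional
   Spanish order). */
static int next_weight(const char **p)
{
    const char *s = *p;
    const char *pos;
    int c, n;

    if (*s == '\0')
        return 0;

    c = tolower((unsigned char)s[0]);
    n = tolower((unsigned char)s[1]);   /* at worst the terminator */
    *p = s + 1;

    pos = strchr(ORDER, c);
    if (pos == NULL)
        return 1;

    if ((c == 'c' && n == 'h') || (c == 'l' && n == 'l')) {
        *p = s + 2;                     /* consume both characters */
        return 2 * (int)(pos - ORDER) + 3;
    }
    return 2 * (int)(pos - ORDER) + 2;
}

/* Compare two strings element by element using the weights above.
   Returns <0, 0 or >0 in the manner of strcmp().  Case is ignored. */
int spanish_compare(const char *a, const char *b)
{
    for (;;) {
        int wa = next_weight(&a);
        int wb = next_weight(&b);

        if (wa != wb)
            return wa < wb ? -1 : 1;
        if (wa == 0)
            return 0;
    }
}
```

With this table, spanish_compare("chico", "cosa") is positive, since "ch" sorts after every word beginning with plain "c", whereas strcmp() on the same pair is negative in ASCII. The character codes never enter into the ordering; only the table does.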