Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!genat!maccs!gordan
From: gordan@maccs.UUCP (Gordan Palameta)
Newsgroups: comp.std.internat
Subject: Re: Character representation
Message-ID: <709@maccs.UUCP>
Date: Wed, 12-Aug-87 22:25:05 EDT
Article-I.D.: maccs.709
Posted: Wed Aug 12 22:25:05 1987
Date-Received: Sat, 15-Aug-87 05:27:18 EDT
References: <2171@enea.UUCP>
Reply-To: gordan@maccs.UUCP (Gordan Palameta)
Organization: DCSS, McMaster University, Hamilton, Ontario, Canada
Lines: 28

In article <2171@enea.UUCP> sommar@enea.UUCP(Erland Sommarskog) writes:
>
>value. But isn't a character a more complicated data type
>than just a simple enumeration type? In some languages the
>combination may constitute a new letter ("a" with ring and dots,
>"o" with dots in Swedish), in others you can apply accents and
>other signs without affecting the sorting. (E.g. French, Italian)
>  I think that the simple representation for characters is completely
>due to the dominating position of the English language in the computer
>world. If computers had been invented in France the problem would
>have been solved. (And if they had been Swedish, Englishmen would

It gets even more complicated:  in Spanish, I believe, ch is considered
a separate letter, between c and d in alphabetical order (likewise with ll).

It only goes to show that alphabetical order is language-dependent: the
same list of strings will sort differently depending on locale.  The only
general solution is to have intelligent, locale-aware operating system
routines to handle sorting.
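
To make the Spanish case concrete, here is a rough sketch of a comparison
routine that treats "ch" and "ll" as single letters sorting just after "c"
and "l".  The names es_rank and es_compare are made up for illustration,
it handles unaccented letters only, and a real system routine would need
to cover far more than this:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Letters in collating order.  Ranks come from the position in this
   string, not from the character codes. */
static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";

/* Rank of the collating element that starts at s (s[0] must not be NUL).
   *len receives the number of characters consumed: 2 for "ch" or "ll",
   otherwise 1.  Anything that is not an unaccented letter gets rank 0. */
static int es_rank(const char *s, size_t *len)
{
    int c = tolower((unsigned char)s[0]);
    int next = tolower((unsigned char)s[1]);
    const char *p = strchr(alphabet, c);

    *len = 1;
    if (c == '\0' || p == NULL)
        return 0;
    if ((c == 'c' && next == 'h') || (c == 'l' && next == 'l')) {
        *len = 2;
        return 2 * (int)(p - alphabet) + 2;  /* slot just after c or l */
    }
    return 2 * (int)(p - alphabet) + 1;
}

/* Compare a and b element by element; the result has the same sign
   convention as strcmp. */
int es_compare(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        size_t la, lb;
        int ra = es_rank(a, &la);
        int rb = es_rank(b, &lb);

        if (ra != rb)
            return ra < rb ? -1 : 1;
        a += la;
        b += lb;
    }
    if (*a == '\0' && *b == '\0')
        return 0;
    return *a == '\0' ? -1 : 1;   /* the shorter string sorts first */
}
```

With this, es_compare("cuna", "chico") is negative (cuna sorts first),
whereas strcmp on the same two strings under ASCII puts chico first.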

Despite 7-bit ASCII, which makes possible code such as
   if (c >= 'A' && c <= 'Z')
there is no reason why the numeric representation of a character should have
anything to do with the position of that character in a collating sequence.
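
As a sketch of what that separation might look like: keep the collating
order in a table indexed by character code, and ask <ctype.h> whether a
character is upper case instead of testing a numeric range.  The names
order, init_rank, collates_before and is_upper_portable are hypothetical,
and the case-interleaved order is just one possible choice:

```c
#include <ctype.h>
#include <limits.h>

/* One possible collating order: each lower-case letter immediately
   followed by its upper-case form. */
static const char order[] =
    "aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ";

/* rank[c] is the 1-based position of c in the order above;
   characters not listed keep rank 0 and so sort before all letters. */
static int rank[UCHAR_MAX + 1];

/* Fill in the table; call once before collates_before. */
void init_rank(void)
{
    int i;

    for (i = 0; order[i] != '\0'; i++)
        rank[(unsigned char)order[i]] = i + 1;
}

/* Nonzero if x comes before y in the collating order, whatever
   their numeric codes happen to be. */
int collates_before(char x, char y)
{
    return rank[(unsigned char)x] < rank[(unsigned char)y];
}

/* Upper-case test that does not assume 'A'..'Z' are contiguous. */
int is_upper_portable(int c)
{
    return isupper((unsigned char)c) != 0;
}
```

After init_rank(), collates_before('a', 'B') is true even though 'B' has
the smaller code in ASCII; changing the order string changes the sort
without touching the character set.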

-- 
UUCP:  ... !mnetor!lsuc!maccs!gordan              BITNET: GP@TANDEM
"Sumasshedshii vsekh stran, soyedinyaites'"        Gordan Palameta