Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: Notesfiles $Revision: 1.6.2.16 $; site ima.UUCP
Path: utzoo!linus!decvax!yale!ima!johnl
From: johnl@ima.UUCP
Newsgroups: net.internat
Subject: Alphabetical Order
Message-ID: <125100001@ima.UUCP>
Date: Mon, 7-Oct-85 23:25:00 EDT
Article-I.D.: ima.125100001
Posted: Mon Oct 7 23:25:00 1985
Date-Received: Wed, 9-Oct-85 06:21:39 EDT
Lines: 33
Nf-ID: #N:ima:125100001:000:1690
Nf-From: ima!johnl Oct 7 23:25:00 1985

Let's talk for a minute or two about putting strings in alphabetical order.
Here, as I understand it, are some of the problems involved:

-- Character set. The codes used for characters not found in the English
alphabet are not well standardized. Some people reassign the "national
option" characters which in the U.S. are things like curly braces. Some,
like the IBM PC crowd, try to define an 8-bit character set. Some, like the
Teletex crowd, define multi-byte sequences for characters with accents. I
have no idea what happens to characters like the Icelandic eth and thorn
which are not created by adding an accent to an English letter.

-- Upper vs. lower case. The mapping between upper and lower case is quite
language specific. Some languages are quite strict about mapping between
corresponding accented upper and lower case, while others (French, notably)
are pretty casual about their upper case accented letters. I gather that
there are languages with lower case letters that have no upper case
equivalent.

-- Digraphs. Many languages have character pairs which, for the purpose of
alphabetization, are treated as one letter, such as Spanish "ll".

-- Alphabet order. Some languages sort accented letters in next to their
unaccented versions. Others put them at the end of the alphabet or otherwise
scramble them around.

Anything else important I've left out?
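
To make the digraph and alphabet-order points a little more concrete, here is a rough sketch in Python of a sort key for one case only. It assumes the traditional Spanish ordering, in which "ch" sorts as a letter after "c", "ll" after "l", and n-tilde after "n", and it assumes that other accents are simply ignored. The names ALPHABET, DIGRAPHS, RANK, strip_accents and sort_key are made up for illustration, and the table is not meant to be authoritative for any real locale.

```python
import unicodedata

N_TILDE = "\u00f1"  # kept as a letter of its own, not as an accented "n"

# Assumed alphabet for illustration: traditional Spanish order.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
            "l", "ll", "m", "n", N_TILDE, "o", "p", "q", "r", "s", "t",
            "u", "v", "w", "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}
DIGRAPHS = {letter for letter in ALPHABET if len(letter) == 2}


def strip_accents(word):
    """Lower-case the word and drop accents, except on n-tilde."""
    out = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == N_TILDE:
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)


def sort_key(word):
    """Turn a word into a list of letter ranks, treating digraphs as one letter."""
    s = strip_accents(word)
    key = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in DIGRAPHS:
            key.append(RANK[pair])
            i += 2
        else:
            # Characters outside the table sort after every known letter.
            key.append(RANK.get(s[i], len(ALPHABET)))
            i += 1
    return key


words = ["luz", "llama", "loco", "chico", "cuna", "d\u00eda", "dedo"]
print(sorted(words, key=sort_key))
```

With this key, "cuna" comes before "chico" and "luz" before "llama", which a plain comparison of character codes would get the other way round. Case is folded away entirely here, so this does nothing about the upper vs. lower case problem.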

John Levine, Javelin Software, Cambridge MA 617-494-1400
{ decvax!cca | think | ihnp4 | cbosgd }!ima!johnl, Levine@YALE.ARPA

The opinions above are solely those of a 12 year old hacker who has broken
into my account, and not those of my employer or any other organization.