Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83 (MC840302); site kuling.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!mcvax!enea!kuling!andersa
From: andersa@kuling.UUCP (Anders Andersson)
Newsgroups: net.nlang,net.text
Subject: Re: about diacritical marks (danish dynamite)
Message-ID: <777@kuling.UUCP>
Date: Fri, 2-Aug-85 06:31:20 EDT
Article-I.D.: kuling.777
Posted: Fri Aug 2 06:31:20 1985
Date-Received: Sun, 4-Aug-85 07:18:58 EDT
References: <1065@diku.UUCP> <763@mcvax.UUCP> <1070@diku.UUCP> <775@mcvax.UUCP> <1087@diku.UUCP>
Reply-To: andersa@kuling.UUCP (Anders Andersson)
Organization: Uppsala University, Sweden
Lines: 63
Xref: linus net.nlang:3149 net.text:472

AE This goes to both net.text and net.nlang, and currently I think this AA
AE cross-posting is appropriate. Probably it won't be at a later time... AA

In article <1087@diku.UUCP> storm@diku.UUCP (Kim Fabricius Storm) writes:
>According to my knowledge ae /o and oa (marked with ^^ in the extract) are NOT
>an 'a-e ligature', an 'o with a crossbar' or an 'a with a circle above' - they
>are genuine letters in the danish alphabet:
> A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA

I was about to bring up almost the same question. However, I wasn't sure
whether this part of the problem is within the scope of the discussion,
and I don't think anyone has actually claimed that these "ligatures",
"umlauts" etc. really are less important versions of other characters in
all languages concerned. So far only their visual representation has been
considered.

Just for anyone's information, here is the Swedish alphabet also:

  A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z oA "A "O

Note the different ordering in the end. The same for Finnish I guess,
except that they don't have oA.

>in danish W only occurs in a few personal names and in foreign words, and in
>most dictionaries it is treated just as if it was a V.

The same in Swedish.
However, I think 'E and "U should be mentioned together with W, as they
also show up sometimes in personal names. 'E is treated like E, and "U
like Y.

If our intention is to create some digital representation of European
written language in a wider sense, and not just those funny graphical
things, then we have to look into this problem as well, yes. And when we
refer to text formatters, this will most likely be the case, I guess...

1. Different alphabets simply sort their letters differently.

2. Various languages put different "value" in the letters they use. In
   the Scandinavian languages, "A and "O (or their equivalents) are
   "real" letters, while in German they are not, just "umlauts". This
   might affect sorting, in that they are "treated as" some other
   letters. I would be glad to receive some Frenchman's view on their
   myriad of accents!

3. There might be slight differences in the printed representation. In
   handwriting, I might use tilde (~) instead of double-dot (") over A
   and O, but when I started writing in German, my teacher pointed out
   that I should not use tilde on the "umlauts".

4. Try to define an international case conversion function when there
   are only two representations of I (with and without dot).
   "International" means that it should work properly in both Paris and
   Ankara.

I don't think we can count on a single text being "written" in one
language only, and thus make general assumptions about how to treat the
letters. For instance, when sorting a list of personal names: Should
G"unther be put before or after Gustaf? Or just think of a world atlas!

Note to the prospective implementors: Please reserve some place where we
could later put an escape sequence to switch over to an entirely
different alphabet -- soon we will want to write in Greek, Hebrew or even
Bulgarian...

Anders Andersson
...!seismo!mcvax!enea!kuling!andersa
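
Points 1, 2 and 4 above can be made concrete with a rough sketch in
Python. Everything in it is an assumption made for illustration: the
table names (ALPHABETS, TREATED_AS), the function names, the use of
single precomposed characters in place of the oA, "A, "O spellings used
in this article, and in particular the German rule that files each
umlaut under its base letter. It is not a description of any existing
sorting or case-conversion routine.

```python
# Single characters standing in for the article's ASCII spellings:
# oA, "A, "O, "U, AE, /O and 'E respectively.
A_RING, A_UML, O_UML, U_UML = "\u00c5", "\u00c4", "\u00d6", "\u00dc"
AE, O_SLASH, E_ACUTE = "\u00c6", "\u00d8", "\u00c9"

BASE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Point 1: each language lists its letters in its own order.
ALPHABETS = {
    "danish": BASE + AE + O_SLASH + A_RING,
    "swedish": BASE + A_RING + A_UML + O_UML,
    "german": BASE,  # assumption: umlauts are not letters of their own
}

# Point 2: letters that a language files under some other letter.
TREATED_AS = {
    "danish": {"W": "V"},
    "swedish": {"W": "V", E_ACUTE: "E", U_UML: "Y"},
    "german": {A_UML: "A", O_UML: "O", U_UML: "U"},  # assumption
}


def sort_key(word, language):
    """Return a list of letter positions in the given language's alphabet."""
    order = ALPHABETS[language]
    treated_as = TREATED_AS.get(language, {})
    key = []
    for ch in word.upper():
        ch = treated_as.get(ch, ch)
        # Characters outside the alphabet sort after every known letter.
        key.append(order.index(ch) if ch in order else len(order))
    return key


# Point 4: case conversion needs to know the language, because Turkish
# pairs dotless I with dotless i, and dotted I with dotted i.
DOTLESS_SMALL_I = "\u0131"
DOTTED_CAPITAL_I = "\u0130"


def to_lower(text, language):
    if language == "turkish":
        text = text.replace("I", DOTLESS_SMALL_I).replace(DOTTED_CAPITAL_I, "i")
    return text.lower()


def to_upper(text, language):
    if language == "turkish":
        text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return text.upper()


names = ["Gustaf", "G\u00fcnther"]
swedish_order = sorted(names, key=lambda w: sort_key(w, "swedish"))
german_order = sorted(names, key=lambda w: sort_key(w, "german"))
```

With these tables, swedish_order puts Gustaf first (since "U counts as
Y), while german_order puts G"unther first (since "U counts as U). Both
functions take the language as an argument, so a list or a text that
mixes languages would need a language tag on each entry before it could
be sorted or case-converted this way.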