Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83 (MC840302); site kuling.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!mcvax!enea!kuling!andersa
From: andersa@kuling.UUCP (Anders Andersson)
Newsgroups: net.nlang,net.text
Subject: Re: about diacritical marks (danish dynamite)
Message-ID: <777@kuling.UUCP>
Date: Fri, 2-Aug-85 06:31:20 EDT
Article-I.D.: kuling.777
Posted: Fri Aug  2 06:31:20 1985
Date-Received: Sun, 4-Aug-85 07:18:58 EDT
References: <1065@diku.UUCP> <763@mcvax.UUCP> <1070@diku.UUCP> <775@mcvax.UUCP> <1087@diku.UUCP>
Reply-To: andersa@kuling.UUCP (Anders Andersson)
Organization: Uppsala University, Sweden
Lines: 63
Xref: linus net.nlang:3149 net.text:472

AE This goes to both net.text and net.nlang, and currently I think this  AA
AE cross-posting is appropriate. Probably it won't be at a later time... AA

In article <1087@diku.UUCP> storm@diku.UUCP (Kim Fabricius Storm) writes:
>According to my knowledge ae /o and oa (marked with ^^ in the extract) are NOT
>an 'a-e ligature', an 'o with a crossbar' or an 'a with a circle above' - they
>are genuine letters in the danish alphabet:
>   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA

I was about to bring up almost the same question. However, I wasn't sure
whether this part of the problem is within the scope of the discussion,
and I don't think anyone has actually claimed that these "ligatures",
"umlauts" etc. really are less important versions of other characters in
all languages concerned. So far only their visual representation has been
considered.

For reference, here is the Swedish alphabet as well:
   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z oA "A "O
Note the different ordering at the end. The same goes for Finnish,
I guess, except that they don't have oA.
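
To make the ordering difference concrete, here is a minimal sketch in
Python. It assumes the real letters are available instead of the ASCII
transcriptions used in this article, and the names SWEDISH, DANISH,
alphabet_key and sort_words are my own, purely for illustration. W is
simply left in its listed position.

```python
# Alphabets as listed above, written with the real letters.
SWEDISH = "ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ"
DANISH = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"


def alphabet_key(word, alphabet):
    """Map each letter to its position in the given alphabet.

    Characters outside the alphabet are ranked after every listed
    letter (an arbitrary choice for this sketch).
    """
    rank = {letter: i for i, letter in enumerate(alphabet)}
    return [rank.get(ch, len(alphabet)) for ch in word.upper()]


def sort_words(words, alphabet):
    """Sort words according to one particular alphabet."""
    return sorted(words, key=lambda w: alphabet_key(w, alphabet))


print(sort_words(["Åre", "Zorn", "Örebro", "Älvdalen"], SWEDISH))
# ['Zorn', 'Åre', 'Älvdalen', 'Örebro']
print(sort_words(["Århus", "Ærø", "Zoo", "Øresund"], DANISH))
# ['Zoo', 'Ærø', 'Øresund', 'Århus']
```

The same letter (oA) comes directly after Z in one alphabet and last in
the other, so the sort routine has to be told which alphabet applies.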

>in danish W only occurs in a few personal names and in foreign words, and in
>most dictionaries it is treated just as if it was a V.

The same in Swedish. However, I think 'E and "U should be mentioned
together with W, as they also show up sometimes in personal names.
'E is treated like E, and "U like Y.

If our intention is to create some digital representation of European
written language in a wider sense, and not just those funny graphical
things, then we have to look into this problem as well, yes. And when
we refer to text formatters, this will most likely be the case, I guess...

1. Different alphabets simply sort their letters differently.

2. Different languages assign different "value" to the letters they use.
   In the Scandinavian languages, "A and "O (or their counterparts) are
   "real" letters, while in German they are not, just "umlauts". This
   might affect sorting, in that they are "treated as" some other letters.
   I would be glad to receive some Frenchman's view on their myriad of
   accents!

3. There might be slight differences in the printed representation.
   In handwriting, I might use tilde (~) instead of double-dot (")
   over A and O, but when I started writing in German, my teacher
   pointed out that I should not use tilde on the "umlauts".

4. Try to define an international case conversion function when there are
   only two representations of I (with and without dot). "International"
   means that it should work properly in both Paris and Ankara.
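
Point 4 can be illustrated with a minimal sketch in Python. It assumes
a character set that has all four forms of I (dotted and dotless, in
both cases), which is more than the two representations mentioned
above. The function names and the "tr"/"fr" language tags are my own,
for illustration only.

```python
def to_upper(text, lang):
    """Upper-case text, given the language it is written in."""
    if lang == "tr":
        # Turkish: dotted i keeps its dot when capitalised.
        text = text.translate(str.maketrans({"i": "İ"}))
    return text.upper()


def to_lower(text, lang):
    """Lower-case text, given the language it is written in."""
    if lang == "tr":
        # Turkish: plain capital I is the dotless letter.
        text = text.translate(str.maketrans({"I": "ı", "İ": "i"}))
    return text.lower()


print(to_upper("paris", "fr"))    # PARIS
print(to_upper("izmir", "tr"))    # İZMİR
print(to_lower("ISPARTA", "tr"))  # ısparta
print(to_lower("PARIS", "fr"))    # paris
```

The point of the sketch is the extra argument: without knowing the
language, the function cannot tell which lower-case letter a capital I
stands for.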

I don't think we can count on a single text being "written" in one
language only, and so we cannot make general assumptions about how to
treat the letters. For instance, when sorting a list of personal names:
Should G"unther be put before or after Gustaf? Or just think of a world
atlas!

Note to the eventual implementors: Please reserve some place where we
could later put an escape sequence to switch over to an entirely
different alphabet -- soon we will want to write in Greek, Hebrew or
even Bulgarian...

   Anders Andersson
   ...!seismo!mcvax!enea!kuling!andersa