Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!mit-eddie!uw-beaver!tektronix!teklds!athena!scottha
From: scottha@athena.TEK.COM (Scott Hankerson)
Newsgroups: sci.lang,comp.std.internat
Subject: Re: Character representation
Message-ID: <1583@athena.TEK.COM>
Date: Wed, 19-Aug-87 12:30:08 EDT
Article-I.D.: athena.1583
Posted: Wed Aug 19 12:30:08 1987
Date-Received: Sat, 22-Aug-87 11:07:12 EDT
References: <15381@mordor.s1.gov>
Reply-To: scottha@athena.UUCP (Scott Hankerson)
Organization: Tektronix, Inc., Beaverton, OR.
Lines: 32
Xref: mnetor sci.lang:1181 comp.std.internat:136

In article <15381@mordor.s1.gov> pom@s1-under.UUCP () writes:
>>>But this doesn't
>>>address all problems I mentioned. How to construct a general character
>>>with an arbitrary accent, umlaut or other diacritic mark? An 8-bit
>>>enumarate isn't sufficient.
>
> The problem you (somebody) mentioned is hereby addressed.
> To disprove my conjecture, name one language with Latin-based alphabet
> and one letter in that alphabet, which admits more then one modifier.

In French, one can have up to four different modifiers over a particular
letter: the letter e can have a grave accent, one going the other
direction (', I never can remember what they're called in English), a
circumflex, or an umlaut. In addition, I may want to quote from other
languages if I write in French.

> Oh, just BTW - using poor ASCII, which has no modifier bit, I am
> using the convention that modifier is indictaed by h ( e.g.
> a word: (modified_s)ot would appear as shot. (which is quite wastefull
> as whole h is needed to perform function of one bit).

Surely this would introduce even more ambiguities. In German, an h
lengthens the preceding vowel. Is a vowel followed by an h an umlauted
vowel? An umlauted vowel followed by an h? Or simply a vowel followed
by an h?

I haven't seen anyone mention an ISO standard yet. I was under the
impression that there was one. Am I wrong?

I don't much care for the alternatives that I have seen used by terminal
manufacturers in the US, namely keyboards with many of the special
symbols replaced by accented characters. That may be nice for writing
documents, but it must be intolerable for coding in C or any other
programming language that uses many nonalphabetic symbols.
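
P.S. To make the ambiguity of the "h as modifier" convention concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the convention is applied to the German vowels a, o and u, so that a following h may either mark an umlaut or be an ordinary letter. The names UMLAUT and readings are hypothetical and come from no standard or existing program.

```python
# Hypothetical mapping: base vowel -> umlauted vowel (a-, o-, u-umlaut).
UMLAUT = {"a": "\u00e4", "o": "\u00f6", "u": "\u00fc"}


def readings(word):
    """Return every way to decode `word` if an h after a, o or u
    may be either an umlaut marker or a literal h."""
    results = [""]
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in UMLAUT and i + 1 < len(word) and word[i + 1] == "h":
            # Two candidate readings: the h is consumed as a modifier,
            # or the vowel and the h are both kept as written.
            as_modifier = [r + UMLAUT[ch] for r in results]
            as_letter = [r + ch + "h" for r in results]
            results = as_modifier + as_letter
            i += 2
        else:
            results = [r + ch for r in results]
            i += 1
    return results


if __name__ == "__main__":
    for word in ("bahn", "kuh", "rot"):
        print(word, readings(word))
```

Under these assumptions every vowel-plus-h pair doubles the number of candidate readings, and nothing in the text itself says which one the writer meant. A word without such a pair, like "rot", decodes only one way.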