Path: utzoo!attcan!uunet!samsung!think!snorkelwacker!bloom-beacon!eru!luth!sunic!enea!sommar
From: sommar@enea.se (Erland Sommarskog)
Newsgroups: comp.std.internat
Subject: Re: ISO standards for non-Latin alphabets
Message-ID: <501@enea.se>
Date: 25 Nov 89 22:42:15 GMT
Organization: Enea Data AB, Sweden
Lines: 41

David J. Birnbaum (djb@wjh12.UUCP) criticized ISO 8859/5 in a long
article in this newsgroup. I'm inclined to agree with him on many
points. I wouldn't say I am completely satisfied with the concept of
Latin-1, Latin-2 etc. As a covering standard, 6937 seems much more
appealing. However, 8859 is here to stay for a while, and I think
we just have to accept it as it is. After all, 8859 is a lot better
than ASCII alone.

Mr. Birnbaum mainly focuses on the Cyrillic set, but many of the
problems he discusses concern the Latin sets as well. I will cover
only one of them here, that of collation order.

>     Character Order
>
>     One advantage to following alphabetic order in character
>coding is that it enables alphabetic sorting by comparing strings
>according to machine order. This type of unfiltered sorting in
>8859/5 is impossible for Ukrainian, Belorussian, Serbocroatian,
>or Macedonian, since the characters from columns 10 and 15 would
>have to be inserted into their proper places. This is a completely
>unnecessary limitation, because with one minor exception(15) all
>modern Slavic languages that use the Cyrillic alphabet follow a
>single order. Not all characters will occur in each language, but
>a single order for the entire character set would have made it
>possible to sort all languages in machine order.(16)

The truth is that a single enumeration doesn't work at all for
many languages. Dotted "A" and dotted "O" are separate letters in
Swedish, but in German they are to be co-sorted with "A" and "O",
or as "AE" and "OE". The same goes for accented letters in many
languages.
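
To make the Swedish/German difference concrete, here is a minimal
sketch in Python of sorting the same strings under different
language rules. The function names, the COLLATORS table and the tiny
alphabets are illustrative assumptions, not any existing package,
and the rules are deliberately simplified rather than complete
national sorting standards.

```python
# Sketch only: simplified per-language sort keys over one character set.

def swedish_key(word):
    # Assumption: å, ä, ö are separate letters placed after z, in that order.
    alphabet = "abcdefghijklmnopqrstuvwxyzåäö"
    rank = {ch: i for i, ch in enumerate(alphabet)}
    # Unknown characters sort after every known letter.
    return [rank.get(ch, len(alphabet)) for ch in word.lower()]

def german_dictionary_key(word):
    # Assumption: ä, ö, ü are co-sorted with a, o, u.
    fold = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
    return word.lower().translate(fold)

def german_phonebook_key(word):
    # Assumption: ä, ö, ü are sorted as ae, oe, ue.
    fold = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})
    return word.lower().translate(fold)

# Hypothetical registry of predefined languages; users could add their own.
COLLATORS = {
    "swedish": swedish_key,
    "german-dictionary": german_dictionary_key,
    "german-phonebook": german_phonebook_key,
}

def sort_words(words, language):
    return sorted(words, key=COLLATORS[language])

words = ["zon", "öra", "ofen", "orm", "ära", "arm"]
print(sort_words(words, "swedish"))
# ['arm', 'ofen', 'orm', 'zon', 'ära', 'öra']
print(sort_words(words, "german-dictionary"))
# ['ära', 'arm', 'ofen', 'öra', 'orm', 'zon']
print(sort_words(words, "german-phonebook"))
# ['ära', 'arm', 'öra', 'ofen', 'orm', 'zon']
```

The three results differ even though the input and the character
codes are identical, so no single code order can serve all three.
Note also that in Latin-1 the code for ä (228) comes before the code
for å (229), so plain machine order is wrong even for Swedish alone.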

The conclusion is that we need software sorting packages that can
be customized to the desired order, with common languages
predefined. Given this, it doesn't seem very important that the
Cyrillic languages be honored with a particular order. As some
other poster said, I think it was Tor Lillqvist, the best would
have been if ASCII had taken the letters in random order.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se