Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site decvax.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!minow
From: minow@decvax.UUCP (Martin Minow)
Newsgroups: net.internat
Subject: ISO Latin 1 alphabet
Message-ID: <163@decvax.UUCP>
Date: Fri, 17-Jan-86 18:34:19 EST
Article-I.D.: decvax.163
Posted: Fri Jan 17 18:34:19 1986
Date-Received: Sun, 19-Jan-86 04:08:03 EST
References: <157@decvax.UUCP> <1166@utai.UUCP>
Reply-To: minow@decvax.UUCP (Martin Minow)
Organization: DEC - ULTRIX Engineering Group
Lines: 51

"ISO Latin 1 8-bit alphabet, what is it?" -- these notes are mostly
from memory, and I apologize in advance for any errors.

Latin-1 is intended to replace the current mess of National Replacement
Character Sets (the ones that use any or all of #@[\]^`{|} for letters
that aren't in the US national alphabet that we usually call ASCII).
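
To make the problem concrete, here is a minimal sketch of how the same
7-bit bytes read under two character sets.  The Swedish assignments in
the table are an assumption from memory (the six positions [\]{|}
standing for A-umlaut, O-umlaut, A-ring and their lower-case forms), and
the names SWEDISH_NRCS and decode_7bit are made up for the illustration.

```python
# Assumed Swedish replacement assignments for six ASCII positions.
# Illustrative only: check the national standard before relying on them.
SWEDISH_NRCS = {
    0x5B: "\u00c4",  # [ becomes A-umlaut
    0x5C: "\u00d6",  # \ becomes O-umlaut
    0x5D: "\u00c5",  # ] becomes A-ring
    0x7B: "\u00e4",  # { becomes a-umlaut
    0x7C: "\u00f6",  # | becomes o-umlaut
    0x7D: "\u00e5",  # } becomes a-ring
}


def decode_7bit(data: bytes, replacements: dict) -> str:
    """Decode 7-bit text, using a national letter wherever the set defines one."""
    return "".join(replacements.get(b, chr(b)) for b in data)


c_source = b"a[i] = {0};"
as_ascii = decode_7bit(c_source, {})
as_swedish = decode_7bit(c_source, SWEDISH_NRCS)
```

Under the assumed Swedish set, the brackets and braces in a line of C
come out as letters, which is the sort of mess an 8-bit alphabet avoids
by leaving the ASCII positions alone.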

The alphabet is currently a draft international standard, being developed
by ISO, ANSI, and ECMA (the European Computer Manufacturers Association).
It is very similar to the "Dec-Multinational" alphabet available
with the VT200-series terminals and Dec's personal computers.
It suits the needs of the majority of Western European Latin-letter
languages, and there are proposals for "Latin-2" and "Latin-3" to
suit the needs of Polish, Lithuanian, etc.

Latin-1 adds accented variants to upper- and lower-case vowels,
as well as a number of other language-specific letters.  There
are also a number of additional symbols.

AEIOU and aeiou are provided in grave, acute, circumflex, and umlaut
variants.  The following letters are also provided:

  A-ring and a-ring (Swedish, Danish, Finnish, Norwegian)
  AE and ae ligatures (Danish)
  A-tilde and a-tilde
  C-cedilla and c-cedilla (French)
  N-tilde and n-tilde (Spanish)
  O-tilde and o-tilde
  O-slash and o-slash (Danish, Norwegian)
  OE and oe ligatures (French)
  ss (German sharp-s)
  Y-umlaut and y-umlaut (French, also used for the ij ligature in Dutch)

The above refers only to Dec-Multinational.  Latin-1 adds a few more
letters -- I believe these include Icelandic th and dh, and Turkish
undotted-i and dotted-I.

While upper- and lower-case variants of the letters are related in
the same way as in "standard" ASCII (the two cases differ by a single
bit), the rules to convert between cases are language-dependent.  For
example, accented lower-case letters generally lose their accents when
converted to upper case in French, but not in Swedish.
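
A minimal sketch of the two conventions, assuming the code positions of
the published ISO 8859-1 table (the draft may differ).  The function
names are hypothetical, and stripping accents through Unicode
decomposition is my stand-in for the French rule, not a statement of
how any particular system does it.

```python
import unicodedata


def latin1_upper(byte: int) -> int:
    """Upper-case one code by clearing bit 0x20, where a case pair exists."""
    is_ascii_lower = 0x61 <= byte <= 0x7A
    # Assumed layout: 0xE0-0xFE are lower-case letters except 0xF7
    # (division sign).  0xDF (sharp s) and 0xFF (y-umlaut) lie outside
    # this range and are left alone.
    is_high_lower = 0xE0 <= byte <= 0xFE and byte != 0xF7
    return byte & ~0x20 if (is_ascii_lower or is_high_lower) else byte


def upper_keep_accents(text: bytes) -> bytes:
    """Swedish-style: the accent survives the case change."""
    return bytes(latin1_upper(b) for b in text)


def upper_drop_accents(text: bytes) -> bytes:
    """French-style as described above: upper-case, then drop the accents."""
    upper = upper_keep_accents(text).decode("latin-1")
    decomposed = unicodedata.normalize("NFD", upper)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("latin-1")


swedish = upper_keep_accents(b"\xe5ngstr\xf6m")  # a-ring ... o-umlaut
french = upper_drop_accents(b"\xe9t\xe9")        # e-acute t e-acute
```

The bit trick is the same in both cases; only the second step depends
on the language, so the choice has to come from the caller rather than
from the character set.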

In preparing for Latin-1, you should carefully go over your programs
to remove any instance of "high-bit used for a flag".  Also,
programs such as grep that let you search for "any alphabetic"
or -- worse -- "upper-case" are going to need rethinking.
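
Both hazards can be sketched in a few lines.  The letter ranges below
assume the published ISO 8859-1 layout, and the function names are
invented for the example; treat it as an illustration of what to look
for, not as a replacement for a proper character-class table.

```python
def strip_flag_bit(byte: int) -> int:
    """The old habit: treat bit 7 as a private flag and mask it off."""
    return byte & 0x7F


def is_alpha_ascii(byte: int) -> bool:
    """'Any alphabetic' as a 7-bit program would test it."""
    return 0x41 <= byte <= 0x5A or 0x61 <= byte <= 0x7A


def is_alpha_latin1(byte: int) -> bool:
    """The same test widened to the assumed Latin-1 letter positions."""
    if is_alpha_ascii(byte):
        return True
    # 0xC0-0xFF are letters except 0xD7 (multiply) and 0xF7 (divide).
    return 0xC0 <= byte <= 0xFF and byte not in (0xD7, 0xF7)


word = b"caf\xe9"  # "cafe" with e-acute as the last letter
mangled = bytes(strip_flag_bit(b) for b in word)
seen_as_word_7bit = all(is_alpha_ascii(b) for b in word)
seen_as_word_8bit = all(is_alpha_latin1(b) for b in word)
```

Masking the high bit silently turns the e-acute into a plain "i", and
the 7-bit alphabetic test refuses to see the original as a word at all.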

Hoping the above hasn't been too incorrect,

Martin Minow
decvax!minow