Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!cs.utexas.edu!execu!sequoia!uudell!bigtex!texsun!newstop!exodus!cairo.Eng.Sun.COM!tut
From: tut@cairo.Eng.Sun.COM (Bill "Bill" Tuthill)
Newsgroups: comp.text
Subject: Re: International character set requirements needed
Keywords: 8-bit data, mail
Message-ID: <5204@exodus.Eng.Sun.COM>
Date: 3 Jan 91 22:51:24 GMT
References: <1990Dec20.012516.23623@ico.isc.com> <keld.662763012@dkuugin> <keld.662879321@dkuugin>
Sender: news@exodus.Eng.Sun.COM
Lines: 28

keld@login.dkuug.dk (Keld Jørn Simonsen) writes:
> 
> Is UNICODE a true subset of ISO 10646?
> Is there a well defined relation between ISO 10646 encoding and UNICODE?

ISO 10646 is still in draft form, so neither question can be answered for
sure until it is finalized.  Disclaimer: I'm not an expert in this area.
However, extrapolating from what I know, it appears that Unicode could be
considered a 16-bit implementation of 10646.  The ISO 10646 draft standard
appears to permit 16-bit implementations of any subset thereof, for use in
process code or communication.  It just so happens that Unicode covers all
Asian characters enumerated by existing national standards, plus characters
from scripts that the 10646 draft omits entirely.  So it may be a subset,
but if so, a largely complete one.

Lee Collins writes:
> Notice that 10646 would require 93,816 separate codes to cover existing
> [Chinese/Japanese/Korean] standards.  Han Unification allows Unicode to
> cover the same standards with only 18,739 unique characters.

Ken Whistler writes:
> Unicode 1.0 also includes the following scripts omitted from DIS 10646:
> Ethiopian, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada,
> Malayalam, Sinhalese, and Lao.

There have been attempts to convert Unicode to 10646 and back again,
I believe with mostly good results.  Of course, some data may be lost
in the conversion, for instance characters in scripts that the 10646
draft omits.
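
To make the round-trip point concrete, here is a rough sketch of how one
might check which characters fail to survive a conversion there and back.
Everything in it is an assumption for illustration: the two tables are toy
data with invented code values (not real Unicode 1.0 or DIS 10646
assignments), and the names A_TO_B, convert and lost_positions are
hypothetical, not part of any actual converter.

```python
# Toy illustration only: the code values below are invented placeholders,
# not real Unicode 1.0 or DIS 10646 assignments.
SUBSTITUTE = None  # hypothetical marker for "no equivalent in the target set"

# Hypothetical forward table: code in set A -> code in set B.
# Code 2 deliberately has no entry, standing in for a character that
# set B omits.
A_TO_B = {
    1: 101,
    3: 103,
}

# Reverse table built by inverting the forward one.
B_TO_A = {b: a for a, b in A_TO_B.items()}


def convert(codes, table):
    """Map each code through the table; unmapped codes become SUBSTITUTE."""
    return [table.get(code, SUBSTITUTE) for code in codes]


def lost_positions(codes):
    """Return the indexes whose value does not survive A -> B -> A."""
    back = convert(convert(codes, A_TO_B), B_TO_A)
    return [i for i, (before, after) in enumerate(zip(codes, back))
            if before != after]


sample = [1, 2, 3]  # the middle code has no entry in A_TO_B
print(lost_positions(sample))
```

A real converter would more likely map unknown characters to a designated
replacement code than to None, but the check is the same: convert, convert
back, and compare position by position.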