Path: utzoo!censor!geac!torsqnt!lethe!yunexus!ists!helios.physics.utoronto.ca!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!exodus!cairo.Eng.Sun.COM!tut
From: tut@cairo.Eng.Sun.COM (Bill "Bill" Tuthill)
Newsgroups: comp.text
Subject: Re: Polyglot List Issue
Keywords: character sets
Message-ID: <6273@exodus.Eng.Sun.COM>
Date: 18 Jan 91 22:36:03 GMT
References: <6600@alpha.cam.nist.gov>
Sender: news@exodus.Eng.Sun.COM
Lines: 40

koontz@cam.nist.gov (John E. Koontz X5180) forwards POLYGLOT:
			(Please forward this to POLYGLOT.)

> ------------------------------
> From:    Tom McFarland <tommc@hpcvlx.cv.hp.com>
> 
> ISO 10646 is a standard being developed by official national representatives;
> Unicode is a grass roots based, competing code set being proposed by a
> group of vendors.

Proposals are currently underway to make Unicode one of the accepted
code planes in 10646, and to create a 10646U (U for Unicode) compaction
form using only 16 bits, rather than the 32 bits required by 10646.
The Unicode Consortium has no wish to compete with ISO 10646, and would
prefer to work with ISO toward a truly useful standard.

> there is not a one-to-one mapping and data may be lost
> converting between the two.

Mainly because Unicode includes many languages that 10646
does not represent.  Both Unicode and 10646 fully represent all ISO 8859
and all existing Chinese/Japanese/Korean national standards.

I believe that C/J/K unification is the right thing to do.  Consider
what the world would be like if English-speaking people insisted on
having their own A-Za-z alphabet, separate from Spanish A-Za-z.  This
is exactly what East Asian countries are doing.

> ISO 10646 is very specific in the forms of use allowed.  One key
> difference that comes to mind is that ISO prohibits assigning
> characters to row/column/plane/group values in the range 0x00-0x20,
> 0x7f-0xA0, and 0xff.  ISO has done this in an attempt to maintain some
> level of backwards compatibility with hardware/software that recognize
> these values as control codes.  Unicode actively uses these values to
> achieve its compactness.

Unicode does leave empty slots for ASCII and ISO 8859 control codes.
Is that sufficient?  I don't understand the purpose of leaving any
more empty slots than those.  Perhaps someone knowledgeable about ISO
10646 could enlighten me.
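
To make the restriction quoted above concrete, here is a rough Python
sketch.  It assumes the three ranges are exactly as Tom states them
(0x00-0x20, 0x7f-0xA0, and 0xff) and that the rule applies to every octet
of a code.  The function names are my own invention for illustration and
come from neither standard.

```python
# Sketch only: the ranges are taken from the quoted description, not from
# the text of ISO 10646 itself.

def is_reserved_octet(b: int) -> bool:
    """True if the octet falls in one of the ranges the quote says are kept free."""
    return 0x00 <= b <= 0x20 or 0x7F <= b <= 0xA0 or b == 0xFF


def octets(code: int, width: int = 2) -> list:
    """Split a code into `width` octets, most significant first."""
    return list(code.to_bytes(width, "big"))


def allowed_by_quoted_rule(code: int, width: int = 2) -> bool:
    """True if no octet of the code lands in a reserved range."""
    return not any(is_reserved_octet(b) for b in octets(code, width))


def usable_values_per_octet() -> int:
    """Count the octet values left over once the reserved ranges are removed."""
    return sum(1 for b in range(256) if not is_reserved_octet(b))


if __name__ == "__main__":
    n = usable_values_per_octet()
    print("usable values per octet:", n)
    print("two-octet codes under the rule:", n * n)
    print("two-octet codes with no restriction:", 256 * 256)
```

If the ranges are as quoted, each octet keeps 188 usable values, so a
two-octet form has room for 188 * 188 = 35344 codes instead of 65536.
That is presumably the compactness trade-off Tom is referring to.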