Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!brl-adm!brl-smoke!smoke!gwyn@BRL.ARPA
From: gwyn@BRL.ARPA (VLD/VMB)
Newsgroups: net.lang.c
Subject: Re:  uses of void
Message-ID: <3212@brl-smoke.ARPA>
Date: Thu, 21-Aug-86 13:11:49 EDT
Article-I.D.: brl-smok.3212
Posted: Thu Aug 21 13:11:49 1986
Date-Received: Thu, 21-Aug-86 22:21:50 EDT
Sender: news@brl-smoke.ARPA
Lines: 55

The idea is not to insist that all C implementations support 16-bit
character codes, but to permit that as a deliberate implementation
decision.  I would hope that the systems I use continue to have
8-bit chars, although I do have to say that quite often I need
to know more about a "letter" than just its class name ("A").  For
example, these days I work a lot with typesetting, bitmap graphics,
and so forth, and sometimes the size and font style of a character
are about as important as its class name.

Of course, it is possible to come up with any number of kludges to
cope with non-traditional DP ideas of characters, and many people
have done so.  The desire, I think, in specifying the C language is
to not have such kludges intrude into implementations where they
are neither wanted nor needed.  As an example of the problems, the
AT&T 16-bit proposal requires strcpy() to handle "escapes", whereas
what one really ought to insist on is that strcpy() handles chars as
in the following simple semantic definition of its function:

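/* Copy chars from src to dest, up to and including the terminating '\0'. */
char *strcpy( char *dest, const char *src )
{
	char	*retval = dest;		/* remember the start for the return value */

	while ( (*dest++ = *src++) != '\0' )
		;			/* all the work is done in the test */

	return retval;
}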

If one adopts a kludge approach, then either strcpy() can no longer
be used to copy a string of text characters, or strcpy() no longer
has such a simple implementation.  If "char" is able to hold 16 bits,
then the semantics of strcpy() can continue to be the simple model
shown above, and strcpy() can copy strings of text characters (this
assumes that one would always use 16 bits per character, even if the
7-bit ASCII code subset could be used for some of them).
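
As a rough sketch of that point (not part of any proposal): since "char"
is 8 bits on most compilers, the hypothetical type wide_char below stands
in for a 16-bit char.  The names wide_char, wide_strcpy, and wide_strlen
are invented for illustration.  The copy loop is the same simple model;
only the element type changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit "char"; not a standard type name. */
typedef uint16_t wide_char;

/* Copy a zero-terminated string of 16-bit characters, terminator included. */
wide_char *wide_strcpy(wide_char *dest, const wide_char *src)
{
	wide_char *retval = dest;

	assert(dest != NULL && src != NULL);

	while ((*dest++ = *src++) != 0)
		;

	return retval;
}

/* Count characters (not bytes) that precede the terminator. */
size_t wide_strlen(const wide_char *s)
{
	size_t n = 0;

	assert(s != NULL);

	while (s[n] != 0)
		n++;

	return n;
}
```

Because every character occupies exactly one element, neither function
needs to know anything about "escapes" or shift states.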

The worst example I have heard so far about international character
set kludgery is the assertion that strcmp() should be useful for
sorting native-language text into "dictionary order".  Anyone who
knows much about dictionaries (particularly oriental ones) should
appreciate how naive that approach is.
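
A small illustration of the problem, assuming an ASCII execution
character set (the helper names sorts_before and strcmp_order_demo are
made up for this sketch): strcmp() compares character codes, so every
upper-case letter sorts ahead of every lower-case one, which is not how
an English dictionary orders words, let alone an oriental one.

```c
#include <assert.h>
#include <string.h>

/* Nonzero if strcmp() would place string a ahead of string b. */
int sorts_before(const char *a, const char *b)
{
	return strcmp(a, b) < 0;
}

/* Checks that hold under ASCII, where 'Z' (90) is less than 'a' (97). */
void strcmp_order_demo(void)
{
	/* A dictionary lists "apple" first; strcmp() says otherwise. */
	assert(sorts_before("Zebra", "apple"));

	/* Case alone separates these two spellings of the same word. */
	assert(sorts_before("Apple", "apple"));
}
```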

One vendor would really like to see a requirement to support, in
effect, multiple 8-bit translation tables, because that vendor has
already taken that particular approach and would have a competitive
edge if it were made mandatory.  I've found that most major UNIX
system vendors are currently grappling with internationalization
issues, and all have taken different approaches.  So far they all
smack of kludgery to me, although some are better than others.

What concerns me is that, if some simple, clean, sufficient solution to
the extended code set issue is not made part of the official C language
specification, demands from ISO will lead to an officially-required
kludge approach that will adversely impact even simple ASCII-based
implementations.