Path: utzoo!attcan!uunet!husc6!spdcc!ima!haddock!karl
From: karl@haddock.ISC.COM (Karl Heuer)
Newsgroups: comp.std.c
Subject: The \c escape
Message-ID: <4604@haddock.ISC.COM>
Date: 17 Jun 88 16:51:59 GMT
Reply-To: karl@haddock.isc.com (Karl Heuer)
Organization: Interactive Systems, Boston
Lines: 76
Approved: karl@haddock.isc.com

Enclosed is the text of a proposal I sent in for the second public review.  I
have yet to receive the official reply, but I hear it's been rejected on the
grounds of limited utility.  I'd like to solicit further opinions before I
write up a rebuttal.

Don't bother to argue whether the correct name should be `\c' or `\z' or `\ ';
the question is whether the feature should exist at all.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
________

Proposal #1

Add new escape sequence \c.

Summary

This proposal cleans up two warts in the language: initializing a character
array without adding a null character, and terminating a hexadecimal escape
which might be followed by a valid hexadecimal digit.  It also allows the user
to explicitly document when a null character is unnecessary, e.g.
write(1,"\n\c",1).

Justification

I presume the Committee is already aware of the need for non-null-terminated
character arrays, since the January Draft makes a special case for them in
3.5.7.  However, the mechanism requires the user to count the characters
himself in order to make sure that he doesn't leave room for the null
character; this is a maintenance nightmare.  My proposal is a cleaner way to
accomplish this.
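
For concreteness, here is a small sketch of the existing 3.5.7 mechanism.
The four-letter tag and the names tag_exact, tag_auto and has_tag are
made up for illustration; they are not from the Draft.

```c
#include <string.h>

/* Four characters into four elements: the 3.5.7 special case applies and
   no null character is stored.  The 4 must be counted by hand. */
static const char tag_exact[4] = "RIFF";

/* Unknown size: the array gets five elements, the last being the null. */
static const char tag_auto[] = "RIFF";

/* Hypothetical helper: nonzero if buf begins with the four-byte tag. */
int has_tag(const char *buf)
{
    return memcmp(buf, tag_exact, sizeof tag_exact) == 0;
}
```

If the literal is later shortened to three characters, tag_exact quietly
acquires a trailing null; if it is lengthened, the declaration stops
compiling.  Under the proposal the size could be left out and the literal
written with a trailing \c, though no existing compiler accepts that.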

It has been suggested that although an escape to suppress the null character
is useful, the termination of hex escapes is not an issue because it is
handled by string literal pasting.

String pasting is useful for line continuation without backslash-newline, and
for constructing string literals in macros, but using it to indicate the end
of a hex escape is a botch.  This is nearly as bad as suggesting that the
whole string be written in hex.
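
The pasting workaround under discussion looks like the sketch below.  The
array names and the helper same_bytes are only illustrative, and 8-bit
chars are assumed.

```c
#include <string.h>

/* Pasting ends the hex escape early: the bytes are 0x12, '3', and the null. */
static const char pasted[] = "\x12" "3";

/* The octal spelling needs no pasting, because an octal escape stops after
   three digits: \022 followed by '3'. */
static const char octal[] = "\0223";

/* Written as a single literal, "\x123" would instead be one escape with the
   value 0x123, which does not fit in an 8-bit char. */

/* Hypothetical check that both spellings denote the same three bytes. */
int same_bytes(void)
{
    return sizeof pasted == sizeof octal
        && memcmp(pasted, octal, sizeof pasted) == 0;
}
```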

Moreover, it's very C-specific; one could not advertise a program that
`accepts all the C escapes' as input, without first solving the hex-
termination problem all over again.

Also, it doesn't handle character constants.  The example in 3.1.3.4 is
clearly a kludge--it suggests replacing the hex escape with octal.  This won't
always be possible: an octal escape holds at most three digits (nine bits),
so on an architecture with 12-bit bytes, for example, some values can only be
written in hex.

Finally, if the \c escape is added anyway for the null-suppression feature,
the additional change of insisting that it be a no-op in other contexts is
minor.

Specific changes

In 3.1.3.4, page 29, line 10, add \c to the list of escapes.  Add the
description: `The \c escape at the end of a string literal suppresses the
trailing null character that would normally be appended.  If \c appears in a
character constant, or anywhere in a string literal other than at the end,
then it is ignored, but may serve to separate an octal or hexadecimal escape
from a following digit.'
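
As a rough illustration of how little this rule costs a program that wants
to accept C escapes as input, here is a minimal sketch of a decoder that
follows the wording above.  It is not taken from any compiler; the names
decode_literal and hex_value are hypothetical, only \n, octal, \x and the
proposed \c are treated specially, and an \x with no digits is not
diagnosed.

```c
#include <stddef.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode the body of a string literal (the text between the quotes) into
 * out, which must have room for strlen(src) + 1 bytes.  Returns the number
 * of bytes stored, counting the trailing null unless the body ended in \c.
 */
size_t decode_literal(const char *src, unsigned char *out)
{
    size_t n = 0;
    int suppress_null = 0;

    while (*src != '\0') {
        suppress_null = 0;
        if (*src != '\\') {
            out[n++] = (unsigned char)*src++;
            continue;
        }
        src++;                              /* skip the backslash */
        if (*src == '\0')
            break;                          /* lone trailing backslash */
        if (*src == 'c') {
            src++;
            suppress_null = (*src == '\0'); /* effective only at the end */
        } else if (*src == 'x') {
            unsigned v = 0;
            src++;
            while (hex_value((unsigned char)*src) >= 0)
                v = v * 16 + (unsigned)hex_value((unsigned char)*src++);
            out[n++] = (unsigned char)v;    /* truncated to one byte */
        } else if (*src >= '0' && *src <= '7') {
            unsigned v = 0;
            int digits = 0;
            while (digits < 3 && *src >= '0' && *src <= '7') {
                v = v * 8 + (unsigned)(*src++ - '0');
                digits++;
            }
            out[n++] = (unsigned char)v;
        } else if (*src == 'n') {
            out[n++] = '\n';
            src++;
        } else {
            out[n++] = (unsigned char)*src++;   /* e.g. \\ or \" */
        }
    }
    if (!suppress_null)
        out[n++] = '\0';
    return n;
}
```

Given the seven input characters \x12\c3, this stores three bytes (0x12,
'3', and the null); given \n\c it stores the newline alone.  An input
consisting of \c and nothing else yields zero bytes, which is the case
the proposal leaves undefined.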

In 3.1.3.4, page 30, line 35, change '\0223' to '\x12\c3'.

In 3.1.4, page 31, line 29, after `A null character is then appended' add
`unless the string literal ended with \c'.  Make a similar change to line 31.
Add the sentence `If a character string literal or a wide string literal has
zero length, the behavior is undefined'.  Add to footnote 16 the text `or it
may lack a trailing null character because of \c'.

In 3.1.4, page 31, line 41, add `This string may also be denoted by
"\x12\c3"'.

In 3.5.7, page 73, line 23, replace `if there is room or if the array is of
unknown size' with `if it has one'.  (The ability to initialize a non-null-
terminated array without using \c may be listed as a Common Extension.)