Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!think!ima!haddock!karl
From: karl@haddock
Newsgroups: net.lang.c
Subject: ANSI C
Message-ID: <86900009@haddock>
Date: Tue, 12-Aug-86 20:22:00 EDT
Article-I.D.: haddock.86900009
Posted: Tue Aug 12 20:22:00 1986
Date-Received: Wed, 13-Aug-86 03:18:55 EDT
Lines: 95
Nf-ID: #N:haddock:86900009:000:5440
Nf-From: haddock!karl    Aug 12 20:22:00 1986


ANSI C folks, please take note.

I've found some questionable things in the May, 1986 draft that I'd like to
open up for discussion by the net.  (I don't have a more recent copy, or even
know if there is one; my apologies if some of these issues have already been
closed.)

In 3.2.2.3, "(void *)0" is called a null pointer constant, though 5.6.3.12
says the value of NULL is implementation-defined.  I take this to mean that
the internal representation of (void *)0 need not have all bits clear.  The
constant 0 used in a pointer context still denotes the null pointer, but an
integer variable whose value happens to be zero need not produce the null
pointer when cast.  Also, if I have a pointer variable whose value happens
to be NULL and I cast it into int, I'll likely get the internal form (what
I'd get if I used a union) rather than zero as a result, right?  In boolean
context ("if (p)"), there is an implied comparison with NULL, not an implied
cast to int.  (In the above, I am assuming sizeof(int) == sizeof(void *).)
Do I have this right?

In 3.3.3 and 3.3.4, we find the definitions of _u_n_a_r_y_-_e_x_p_r_e_s_s_i_o_n and _c_a_s_t_-\
_e_x_p_r_e_s_s_i_o_n.  Wouldn't it have been simpler to define cast to be a unary
operator?  (That's the way *I* always think of it.)  In other words, it seems
one could add "( _t_y_p_e_-_n_a_m_e ) _u_n_a_r_y_-_e_x_p_r_e_s_s_i_o_n" to the 3.3.3 syntax and move
section 3.3.4 to 3.3.3.x.  Am I overlooking something?

In 4.2.1.1: "If NDEBUG is defined ... use of the assert macro will have no
effect."  This needs to be clarified as follows: "The expression {shall|may|
shall not} be evaluated for side effects."  (Consider e.g. "assert(getchar()\
==EOF)", for example.)  The UNIX* version currently does not evaluate it; it
sometimes would be more useful if it did.  In any case the ambiguity needs to
be resolved.

In 4.7.1.1, we see that all signals are initialized to SIG_DFL, a surprising
contrast to the way things currently work on UNIX, where signals can be set
to ignore (by nohup(1) or other means) before the program starts.  I see two
interpretations:

    [a] SIG_DFL means "die on this signal"; programs like nohup can no longer
	be written.

    [b] If nohup presets the signal to be ignored, then for the duration of
	the program exec'd from it, SIG_DFL means "ignore this signal".

The first interpretation is clearly a problem.  The second is more subtle.  I
think "if (signal(SIGINT, SIG_IGN) == SIG_IGN) signal(SIGINT, trap);" would
now be broken, because the first signal() will return SIG_DFL rather than
SIG_IGN.

In 4.9.3: "When opened, the standard streams are fully buffered if and only
if the stream does not refer to an interactive device."  So, stderr isn't a
special case anymore?  (It's currently unbuffered.)

In 4.9.6.1 and 4.9.6.2 (fprintf and fscanf), it says that any conversion
specifier other than those listed results in undefined behavior if it's a
lower-case letter, otherwise implementation-defined behavior.  In other
words, the lower-case letters are reserved for future official releases, and
the other characters are reserved to the implementation.  This is a bad idea.
Note that the flag characters occupy the same "namespace" as the conversion
specifiers; future non-alphabetic flags could potentially clash with some
implementation's conversion specifier.  I suggest that there should be *one*
character reserved for the implementation, which can be used as a prefix.
E.g. if I want my library to have a binary-output format, I could then use
"%&b" if "&" is the prefix.  Sort of like a run-time #pragma.

In 4.9.6.2, the documentation for the "[" conversion specifier (which scans a
string not delimited by white space) fails to mention a range syntax, but the
example on the next page uses "%[0-9]".  Also, it fails to mention whether
one can include a right-bracket in the scanset; most implementations allow
it as the first character (following the "^", if any).  (Unless they want to
allow an empty scanset, in which case *that* should be documented.)

In 4.10.4.4, we see that any function (presumably the user's) which is
registered by onexit() should take zero arguments and return a value of type
onexit_t.  No mention is made of what this return value should be, or who
will look at it.  The function onexit() itself returns type onexit_t, but the
only comment is that it "compares equal to a null pointer constant" if it
succeeds.  (Aha, so onexit_t must be some kind of pointer!)  If the user is
only expected to test for success/failure, why not just return an int?  It
seems to me that this could have been declared "int onexit(void (*)(void))",
omitting the onexit_t stuff completely.

But, back in 4.10 it was claimed that onexit_t "is the type of both the
argument to and the value returned by the onexit function".  One wonders if
"argument to [onexit]" should have been "result of the function pointed to by
the argument to [onexit]".  If not, and we take it literally, we conclude
that "onexit_t" is a synonym for "onexit_t (*)(void)", an undeclarable type.
Someone hinted at this some months ago in an earlier posting, so maybe it's
true.  But why?

Finally, in 5.1.3 we have a really serious error :-).  "#error" and "#pragma"
are listed in the wrong font.

Karl W. Z. Heuer (ihnp4!ima!haddock!karl), The Walking Lint
*UNIX is a trademark of AT&T Bell Laboratories.