Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!uakari.primate.wisc.edu!uflorida!haven!mimsy!chris
From: chris@mimsy.umd.edu (Chris Torek)
Newsgroups: comp.std.c
Subject: Re: ANSI draft interpretation questions
Message-ID: <21690@mimsy.umd.edu>
Date: 8 Jan 90 08:12:26 GMT
References: <21623@mimsy.umd.edu> <11879@smoke.BRL.MIL> <21675@mimsy.umd.edu> <11897@smoke.BRL.MIL>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 118

[me: (%n is a conversion, but is not an assignment; ...]

In article <11897@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>No, %n involves both a conversion and an assignment, but no input action.

Except that it is not counted in the return value.  (Neither are
suppressed assignments, but those are `suppressed assignments', not
`assignments'.)  %n, then, is a conversion and an assignment, but cannot
be called an assignment because it is not counted as an assignment.
This is the sort of thing that causes confusion as to whether `%*n'
suppresses the (already not counted as an assignment) assignment.

>>      If the input has the form
>>              0x
>>              0X
>>      and the conversion is either `i' or `x', the sign (if any) and
>>      the zero are consumed; the `x' or `X' remains unconsumed.

>Right (assuming that there is no hex digit immediately following the x).

Oops, I meant to say `0'.  Anyway, if the implementation of *scanf()
uses lookahead to handle scanning, it needs at least three bytes of
lookahead.  If it uses pushback (which is legal but not required), the
implementation must provide at least its own three bytes of pushback
plus one more.  Mine uses a combination of lookahead and pushback: it
looks at the first remaining character in the buffer, and consumes it
if it appears to be valid.
If it later discovers that it was not, there might be one character of
lookahead around, and two characters consumed that need to be pushed
back; in this case, both are pushed back, and the implementation further
guarantees at least one more pushback.

Incidentally, it is not clear to me whether the standard requires the
following to work.  (The important line is marked with -> on the left.)

    #include <stdio.h>
    #include "h_defs.h"     /* for H_VALUE values */

    /*
     * Assume `stream' is open to a read stream on which
     * the next few input characters are either
     * `h' or perhaps `hello'.
     * If the format is `h', stuff the value into
     * the given h_value pointer and return 1.  Leave *h_value
     * unchanged otherwise.
     *
     * If there is an h, but it is not followed by a space or a digit,
     * leave the h and what follows it unconsumed.
     */
    int find_h_value(FILE *stream, int *h_value) {
        int c, v, n, r;

        c = getc(stream);
        if (c != 'h') {
            /* nb: ungetc(EOF) fails; this is desired */
            (void) ungetc(c, stream);
            return (HV_NO_H);           /* no `h' */
        }
        if ((r = fscanf(stream, " %n%d", &n, &v)) == EOF) {
            /* must have been an input failure: conk out */
            return (HV_H_WITH_EOF);
        }
        if (r == 1) {
            *h_value = v;
            return (HV_WITHVALUE);      /* got an h value */
        }
        /* r must be 0 */
        if (n == 0) {
            /* there was no white space: put back the `h' */
->          (void) ungetc('h', stream);
            return (HV_UNCHANGED);      /* input stream unchanged */
        }
        /* there were spaces, so we may not be able to put back
           the `h'; return a code saying `keyword h found, followed
           by something not an integer' */
        return (HV_H_WITH_UNKNOWN_TEXT);
    }

If n is zero, we know the scanf() did not consume any characters.  We
may therefore be required to allow the `h' to be pushed back.  I am not
sure.  Consider an implementation similar to the old Unix one, however,
in which one fills a buffer whenever a `getc' (or equivalent) is done on
an empty buffer.  Here we might have the following:

    A. buffer is nearly exhausted: it has one `h' left
    B. program does a `getc', which returns 'h': buffer now empty
       (at this point, ungetc('h') will work.)
    C. program calls fscanf() which calls __vfscanf(), which
       starts the ` ' directive, needs to skip spaces, and
       therefore refills the buffer
    D. __vfscanf() finds an `e' (from `hello', perhaps) and
       stops skipping spaces
    E. __vfscanf() executes `%n' directive, which stores 0 in n
    F. __vfscanf() tries to execute `%d', finds an `e', and
       stops with a matching failure (returns 0)

At this point, there is probably no room in the input buffer to push
back the `h'.

Then again, the description for `ungetc' does not indicate that any
`getc' must be done in advance.  It says that one character of pushback
is guaranteed.  Perhaps this is meant to imply that

    FILE *foo = fopen("foo", "r");

    if (foo == NULL)
        die();
    (void) ungetc('a', foo);

is guaranteed to push back an `a', so that the first getc(foo) returns
'a'.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu        Path:   uunet!mimsy!chris