Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!decwrl!sun!pitstop!sundc!seismo!uunet!auspex!guy
From: guy@auspex.UUCP (Guy Harris)
Newsgroups: comp.lang.c
Subject: Re: modification of strings
Message-ID: <1029@auspex.UUCP>
Date: 17 Feb 89 18:58:34 GMT
References: <7429@csli.STANFORD.EDU> <466@oglvee.UUCP> <11711@haddock.ima.isc.com> <3656@arcturus> <3268@uhccux.uhcc.hawaii.edu>
Reply-To: guy@auspex.UUCP (Guy Harris)
Organization: Auspex Systems, Santa Clara
Lines: 35

>Rumour has it that sscanf modifies strings passed as a first argument
>on at least some machines (e.g. some suns?).

"Some" Suns?  Yeesh, "_doscan" isn't one of the machine-dependent
modules; the same source is used on *all* Suns.  In fact, the same
source is used on a bunch of non-Sun machines as well; the SunOS 3.2-3.5
version is based on the S5R2 version, the SunOS 4.0 version is based on
the S5R3 version, and the version in SunOS releases prior to 3.2 is
based on the 4.2BSD version, which is probably based on the V7 version.
The bug exists in S5 releases from AT&T, as well as 4.xBSD.

The problem is that "*scanf" - or, to be precise, "_doscan" and the
routines it calls, which are the "guts" of the "scanf" routines in many
implementations - uses "ungetc".  All very well and good when you're
doing I/O to a file; "ungetc" stuffs the ungotten character back into
the I/O buffer.  However, the way "sprintf" and "sscanf" work in many
(most?) UNIX C implementations is that they turn the string in question
into a "funny" I/O buffer; most "ungetc" implementations don't
understand this, though, and try to stuff the character back into the
"buffer" anyway, which means they try to modify the string.

>Well, it doesn't actually modify the contents,

Which, in this particular case, is, I think, true; the character being
stuffed back is a character that's just been "read" from the string.

>but the compiler doesn't know that.

It's not the compiler that has to know that; it's "ungetc".

In "comp.bugs.4bsd" this very "sscanf" bug is being discussed; one
suggested fix is to have "ungetc" check whether the character it's
stuffing back into the buffer is the one that is in the buffer and, if
so, just back up the buffer pointer and count.
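
To make the suggested fix concrete, here is a rough sketch of the idea.
It is not the source of any real "ungetc" or "_doscan"; "struct strbuf"
and the "sb_*" routines are made-up names standing in for a stdio buffer
that has been pointed at a string, and the "writes" counter exists only
so the demonstration can tell whether a store into the string happened.
What the checked version should do when the character does *not* match
isn't part of the suggestion as described; this sketch just refuses.

```c
#include <assert.h>
#include <stdio.h>   /* for EOF */
#include <string.h>

/* Hypothetical stand-in for a stdio buffer wrapped around a string. */
struct strbuf {
    char *base;    /* start of the caller's string */
    char *ptr;     /* next character to be read */
    int   cnt;     /* characters left to read */
    int   writes;  /* stores made into the string (demo only) */
};

void sb_open(struct strbuf *sb, char *s)
{
    sb->base = s;
    sb->ptr = s;
    sb->cnt = (int)strlen(s);
    sb->writes = 0;
}

int sb_getc(struct strbuf *sb)
{
    if (sb->cnt <= 0)
        return EOF;
    sb->cnt--;
    return (unsigned char)*sb->ptr++;
}

/* Naive push-back: always stores the character, so it writes into
   the caller's string even when the value would not change. */
int sb_ungetc_naive(int c, struct strbuf *sb)
{
    if (c == EOF || sb->ptr == sb->base)
        return EOF;
    *--sb->ptr = (char)c;
    sb->cnt++;
    sb->writes++;
    return c;
}

/* Checked push-back: if the character is the one already sitting in
   the buffer, just back up the pointer and count; no store is made. */
int sb_ungetc_checked(int c, struct strbuf *sb)
{
    if (c == EOF || sb->ptr == sb->base)
        return EOF;
    if ((unsigned char)sb->ptr[-1] != (unsigned char)c)
        return EOF;  /* assumption: refuse rather than modify the string */
    sb->ptr--;
    sb->cnt++;
    return c;
}

/* Read one character and push it back; returns how many stores were
   made into the string along the way. */
int demo_pushback(int checked)
{
    char text[] = "42 apples";
    struct strbuf sb;
    int c;

    sb_open(&sb, text);
    c = sb_getc(&sb);
    if (checked)
        sb_ungetc_checked(c, &sb);
    else
        sb_ungetc_naive(c, &sb);
    assert(strcmp(text, "42 apples") == 0);  /* contents same either way */
    return sb.writes;
}
```

Both versions leave the contents alone, but only the naive one actually
stores into the string, which is what matters if the string happens to
live somewhere that can't be written.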