Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!hoptoad!tim
From: tim@hoptoad.uucp (Tim Maroney)
Newsgroups: comp.sys.mac.programmer
Subject: Re: Reading Between the Lines
Message-ID: <7015@hoptoad.uucp>
Date: 15 Apr 89 19:38:34 GMT
References: <451@biar.UUCP> <28839@apple.Apple.COM> <4012@ece-csc.UUCP> <6987@hoptoad.uucp> <4015@ece-csc.UUCP>
Reply-To: tim@hoptoad.UUCP (Tim Maroney)
Organization: Eclectic Software, San Francisco
Lines: 122

In article <4015@ece-csc.UUCP> jnh@ece-csc.UUCP (Joseph Nathan Hall) writes:
>Your comments about printing (deleted from above) are well taken. But
>so far as reading in "newline" mode goes:
>
>  1) Virtually all programming environments on all major operating
>     systems support this (C, Pascal, FORTRAN, etc., on UNIX, VMS,
>     MS-DOS, etc.) at a reasonably low level in a reasonably
>     versatile manner.

Actually, C only provides this at a high level. UNIX only provides it
with low-level I/O to certain devices such as terminals. (There may be
an fcntl I don't know about to do this on any file, but it's not
commonly used if so.)

The C high-level I/O routines are available in all Mac C
implementations with which I'm familiar. The same can be said for the
Pascal readline and related routines. So C and Pascal do have this
capability on the Mac; I don't see why the OS should be expected to
provide it as well.

>  2) Why would it have to be slow? The most common mid- to high-level
>     UNIX I/O is streamed and it's not a problem. I just want to
>     read mid-sized text files (xx-xxxK); I don't want to sector-
>     copy volumes...

Context switching is a big consideration; efficiency demands that you
call the OS as infrequently as possible. There's also the issue of
fetching entire disk blocks at once, though this is less relevant on a
caching OS like UNIX or the Mac. Expect context switching into the
kernel to become even more expensive on the Mac as the OS becomes more
sophisticated.
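
To make the point about calling the OS as infrequently as possible
concrete, here is a minimal sketch. It is written against POSIX read()
rather than the Mac File Manager, and it is an illustration only, not
the code from any of the programs mentioned in this thread. The names
BLOCK_SIZE, line_fn, for_each_line, line_stats and count_line are all
made up for the example. It fetches one large block per call into the
OS and carves the lines out of the block in user code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* illustrative; bigger blocks mean fewer OS calls */

/* Called once per line; len excludes the terminator. */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/* Example callback: count the lines and remember the longest one. */
struct line_stats { long lines; size_t longest; };

void count_line(const char *line, size_t len, void *ctx)
{
    struct line_stats *s = ctx;
    (void)line;
    s->lines++;
    if (len > s->longest)
        s->longest = len;
}

/*
 * Read fd in blocks of up to BLOCK_SIZE bytes and pass each line to fn.
 * eol is the terminator ('\r' for Mac text files, '\n' on UNIX).
 * A line longer than BLOCK_SIZE is delivered in pieces (a simplification).
 * Returns the number of read() calls made, or -1 on a read error.
 */
long for_each_line(int fd, char eol, line_fn fn, void *ctx)
{
    static char buf[BLOCK_SIZE];
    size_t have = 0;          /* bytes carried over from the last block */
    long calls = 0;

    assert(fn != NULL);
    for (;;) {
        ssize_t got = read(fd, buf + have, BLOCK_SIZE - have);
        calls++;
        if (got < 0)
            return -1;
        have += (size_t)got;

        char *start = buf;
        char *end = buf + have;
        char *p;
        /* Hand over every complete line in the buffer. */
        while ((p = memchr(start, eol, (size_t)(end - start))) != NULL) {
            fn(start, (size_t)(p - start), ctx);
            start = p + 1;
        }

        size_t left = (size_t)(end - start);
        if (got == 0 || left == BLOCK_SIZE) {
            /* End of file, or a line that fills the buffer: flush it. */
            if (left > 0)
                fn(start, left, ctx);
            left = 0;
        }
        if (got == 0)
            return calls;

        /* Keep the partial last line for the next block. */
        memmove(buf, start, left);
        have = left;
    }
}
```

A caller would open the file, pass the descriptor together with
count_line and a zeroed struct line_stats, and compare the returned
call count against the number of lines seen. On a regular file the
number of trips into the OS is roughly the file size divided by
BLOCK_SIZE, however many lines the file holds.
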
I'd also point you to Earle's empirical measurements confirming this
effect, and I also noticed that increasing buffer size in the TOPS
Terminal text editor's open operation greatly improved speed (even when
the buffers were already bigger than disk blocks).

Another message from John Gilmore has informed me that one-disk-block
at a time fetches can actually be more efficient if the OS implements a
block prefetch capability. That way, you are processing block n at the
same time the OS is fetching block n+1. However, the Mac doesn't do
this, and because disk I/O on the Mac is so processor intensive, it may
never do so.

>  3) Most programming tasks handled by the Toolbox and other system
>     software aren't beyond the skills of professional programmers.
>     That's not the point. Fast character I/O (buffered by the
>     system or language run-time routines) and line-at-a-time I/O are
>     SIMPLE and USEFUL tools. During development, who CARES if I/O
>     performance is -25% of optimum?

I do, for one. A lack of efficiency in a program under development
slows down the edit-compile-test cycle significantly. I spend enough
of my life waiting for compilers, I don't want the testing to be slow
as well. But in any case, the capability you want is available in both
C and Pascal using high-level I/O routines that handle their own
buffering, so I don't see what you're complaining about.

>  4) Anyway, I disagree with the simple assertion that the "key to speed"
>     is reading as much as possible at a time from disk. This is just
>     not true. You can't hope for much improvement in I/O performance
>     once your buffer size exceeds the controller's buffer size.
>     You may even suffer a speed *penalty* if you do random I/O on a
>     fragmented file with a buffer size > sector size.
>     Furthermore, in situations where it is necessary to scan the file
>     being read character-by-character anyway (when reading a text
>     file, or parsing a source that must be kept on disk, or whatever)
>     the overhead of a system-based character I/O routine can be
>     negligible. In systems that provide fast character I/O, it can
>     be SLOWER to do the "raw" block reads yourself if you have to
>     write the supporting code in a HLL.

This is speculation. The experience of those of us who've actually
written text file readers on the Mac is quite different. In general,
the larger the buffer, the less time spent in Read, and the difference
is large enough that the user will notice for any large file. Aside
from the trap dispatch (context switch) overhead, there's a per-Read
overhead which apparently comes from consultation of the disk
directories to find the proper physical blocks, as well as the
mechanics of request queueing and so forth.

>The lack of explicit Toolbox support for non-block-oriented file I/O is, at
>best, a weird omission. Sure, it's there, but >hiss< >boo< it's buried in
>the low-level routines and it's not well documented. Are we just not
>SUPPOSED to read text files on the Mac the same way we read them on any
>machine? Sheesh.

I have to say, if you're reading a line at a time on any machine, it's
likely you're taking a performance hit. And writing a loop to turn
blocks into lines on your own is so easy that a first-semester
programmer could do it. I don't see why it should have been included
in the Toolbox at all -- it isn't in UNIX, which you cite as a favorable
example -- but since it is, your complaints make even less sense.

If you want to write your own block-structured high-level FSRead-style
call, it's trivial to do so, and you can easily contain the supposed
complexity to this single short routine. (But why are people phobic
about parameter blocks?)

OSErr FSReadLine(refNum, count, buffPtr)
short refNum;
long *count;
char *buffPtr;
{
    IOParam io;

    io.ioRefNum = refNum;
    io.ioBuffer = buffPtr;
    io.ioReqCount = *count;
    /* newline mode: bit 7 set, newline character (0x0d, CR) in the high byte */
    io.ioPosMode = fsFromMark | 0x0d80;
    io.ioPosOffset = 0;
    PBRead(&io, false);         /* false = synchronous call */
    *count = io.ioActCount;
    return io.ioResult;
}

I haven't tested this, so I can't guarantee it works, but it's
certainly close to what you want. (You may have to explicitly move the
mark forward when you use fsFromMark instead of fsAtMark; I don't know,
and the documentation seems ambiguous on this point.) Is the lack of a
trivial routine like this in Inside Macintosh really a problem?
-- 
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
These are not my opinions, those of my ex-employers, my old schools, my
relatives, my friends, or really any rational person whatsoever.