Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!cs.utexas.edu!rutgers!netnews.upenn.edu!eecae!cps3xx!rang
From: rang@cpsin3.cps.msu.edu (Anton Rang)
Newsgroups: comp.sys.mac.programmer
Subject: Re: Reading Between the Lines
Summary: There are reasons to have newline support in the OS.
Keywords: newline, OS support, reading lines from files
Message-ID: <2551@cps3xx.UUCP>
Date: 16 Apr 89 00:22:57 GMT
References: <451@biar.UUCP> <28839@apple.Apple.COM> <4012@ece-csc.UUCP> <6987@hoptoad.uucp> <4015@ece-csc.UUCP> <7015@hoptoad.uucp>
Sender: usenet@cps3xx.UUCP
Reply-To: rang@cpswh.cps.msu.edu (Anton Rang)
Distribution: na
Organization: Michigan State University, Computer Science Dept.
Lines: 58
In-reply-to: tim@hoptoad.uucp's message of 15 Apr 89 19:38:34 GMT

In article <7015@hoptoad.uucp> tim@hoptoad.uucp (Tim Maroney) wrote
lots of stuff in reply to article <4015@ece-csc.UUCP> by
jnh@ece-csc.UUCP (Joseph Nathan Hall).  I've deleted the articles to
save space....

1. Why should an OS provide newline support when high-level languages
   also provide it?

     To make life easier for the developer of a HLL.  Also, suppose
   that a program uses both C and Pascal, calling both fgets() and
   readln().  If the OS provides the newline support, then you don't
   have (much) duplication of code in the support libraries.

2. Using individual read calls is slow; why use them?

     Well, they're probably always slower than doing stuff at a very
   low level--I can write my own disk I/O routines and read stuff
   faster by totally bypassing the file manager.
     Just as one answer, maybe there's a reason I don't want to
   allocate a big fixed-size buffer for reading this file--after all,
   the smallest size which would make sense for a buffer is a disk
   block.  Maybe I'm trying to conserve memory in an INIT; maybe I
   need to read the file without worrying about running out of memory
   in the process.

3. Why do stuff inefficiently during development which we'd make more
   efficient for a production program anyway?

     Perhaps I'm porting a program from another operating system.
   Maybe the newline character is different (gasp!)--I might not want
   to worry about fixing this up yet.  As Tim pointed out, there isn't
   really anything to complain about here if you're using C or Pascal
   anyway.

4. A bit more complex.  Joseph Hall claims that reading as much as
   possible on each read call isn't necessarily the key to speed.  Tim
   says it's speculation.

     One point here--if allocating a 32K buffer to read a text file
   quickly means swapping out 32K of code from somewhere, this might
   be true.  A procedure which counts the number of lines in a text
   file may well find that using a huge buffer is overkill.

5. A final note (of my own).  Tim says that "if you're reading a line
   at a time on any machine, it's likely you're taking a performance
   hit."

     Just to make things a little more complicated, I'd like to say
   that there are systems which do NOT require any specific character
   to mark the end of a line--if you say writeln() it writes out your
   data, whether it contains ^M or ^J or whatever.  On these systems,
   reading data block-by-block and trying to figure out the end of a
   line is either near-impossible or just plain slow.  [Quibble,
   quibble.]

6. Tim says "And writing a loop to turn blocks into lines on your own
   is so easy that a first-semester programmer could do it."

     Probably true.  But writing an *efficient* loop probably means
   using assembly language, at least until some decent optimizing
   compilers are widely available on the Mac.

I apologize (a little) for using net bandwidth on this.  It probably
doesn't really belong in this group....

+---------------------------+------------------------+----------------------+
| Anton Rang (grad student) | "VMS Forever!"         | "Do worry...be SAD!" |
| Michigan State University | rang@cpswh.cps.msu.edu |                      |
+---------------------------+------------------------+----------------------+
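
[Illustration for points 4 and 6, not part of the original article.]
A minimal sketch in portable C of the line-counting case from point 4:
it reads one small block at a time and scans it for terminators, so no
32K buffer is needed.  The name count_lines, the 512-byte BLOCK_SIZE,
and the use of stdio rather than the Mac File Manager are all
assumptions made for the example; the terminator is a parameter because
the newline character differs between systems (^M on the Mac, ^J on
Unix).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical block size: roughly one disk block, not a 32K buffer. */
#define BLOCK_SIZE 512

/*
 * Count line terminators in fp by reading one small block at a time.
 * `eol` is the terminator byte: '\r' for Mac text, '\n' for Unix text.
 * Returns the number of terminators seen, or -1 on a read error.
 * A final line with no terminator is not counted.
 */
long count_lines(FILE *fp, int eol)
{
    unsigned char buf[BLOCK_SIZE];
    long lines = 0;
    size_t got;

    assert(fp != NULL);

    while ((got = fread(buf, 1, sizeof buf, fp)) > 0) {
        const unsigned char *p = buf;
        const unsigned char *end = buf + got;

        /* Let memchr do the scanning; each hit is one line ending. */
        while ((p = memchr(p, eol, (size_t)(end - p))) != NULL) {
            lines++;
            p++;    /* resume just past the terminator */
        }
    }
    return ferror(fp) ? -1 : lines;
}
```

Counting needs no state carried between blocks, so the buffer can be
as small as one block.  Handing whole lines back to a caller is harder,
because a line may straddle two blocks and the partial line has to be
kept somewhere until the next read.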