Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!umcp-cs!cvl!umd5!zben
From: zben@umd5.UUCP (Ben Cranston)
Newsgroups: net.micro.amiga,net.unix-wizards
Subject: Re: Speed of seeks
Message-ID: <966@umd5.UUCP>
Date: Fri, 16-May-86 18:09:07 EDT
Article-I.D.: umd5.966
Posted: Fri May 16 18:09:07 1986
Date-Received: Sun, 18-May-86 15:26:02 EDT
References: <12593@ucla-cs.ARPA> <645@baylor.UUCP>
Reply-To: zben@umd5.UUCP (Ben Cranston)
Distribution: net
Organization: U of Md, CSC, College Park, Md
Lines: 41
Xref: linus net.micro.amiga:6896 net.unix-wizards:15065
Summary: Yet another SEEK implementation

In article <645@baylor.UUCP> peter@baylor.UUCP (Peter da Silva) writes:
>Incidentally, despite the poor design of the files a seek() does not have to
>read every sector... a mistake often made by library writers is to try to
>make seek offsets simple integers. According to the library, the argument
>to an absolute seek() (lseek(fd, off, 0) or lseek(fd, off, 2)) only needs
>to be the returned value from a tell() call: it may indeed be a magic cookie
>like a sector/offset pair (and in fact "magic cookie" is the way it's described
>in the manual). It is under RSX/11M and on the ATARI 800.
>This error is not restricted to relative newcomers: there's an IBM mainframe
>implementation of 'C' that copies all files into fixed record length files
>when you open them just so you can use UNIX-like seeks. If you want to do
>a UNIX-like seek, build UNIX-like files (either one long "record" or a bunch
>of maximum length records) so your offset calculations work. It's not
>meaningful to seek to an unknown depth in a text file or other weird file
>anyway.

The Software Tools NOTE/SEEK design uses two Fortran integers to store SEEK
addresses. The predominant text data format on the Sperry 1100 system is a
variable length record, with the record length in a four byte header area.
My implementation of the Tools for the Sperry uses the first of the two
Fortran integers as the "character address within file" (i.e. 4 x wordaddr)
and the second Fortran integer as the "character number within this record",
that is, how many characters back to go to get to the record header. The
code uses this value to get "back in sync" after a random seek.

This has the advantage that the first word of the address appears to be a
normally-incrementing address, with 4-7 spaces between records.

It would be possible to optimize NOTE address storage: if one knew that the
positions stored would always be at the beginning of a record and that the
file was always ASCII, one could keep just the first integer and supply "4"
for the second.

Oh, and if the character code is "Fieldata" (tm) rather than ASCII then the
second word is negative. For historical reasons only...

-- 
"We're taught to cherish what we have  |          Ben Cranston
 by what we have no longer..."         |          zben@umd2.umd.edu
...{seismo!umcp-cs,ihnp4!rlgvax}!cvl!umd5!zben
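
For anyone who wants to see the two-integer address scheme in action, here is
a rough sketch in Python. It is not the original Tools code. The big-endian
header encoding, the zero padding to a four-character word boundary, and the
names build_file, note and seek are all assumptions made for illustration.
The Fieldata case (negative second word) is not modeled.

```python
import struct

HEADER = 4  # bytes in the record-length header, as described in the post
WORD = 4    # characters per word for ASCII (assumed from "4 x wordaddr")


def build_file(records):
    """Pack records as [4-byte length][data][padding to a word boundary].

    The header encoding and padding byte are assumptions for this sketch.
    """
    out = bytearray()
    for rec in records:
        out += struct.pack(">I", len(rec)) + rec
        out += b"\0" * (-len(rec) % WORD)
    return bytes(out)


def note(header_addr, index):
    """Return (character address, characters back to the record header)
    for data character `index` of the record whose header is at header_addr."""
    back = HEADER + index
    return (header_addr + back, back)


def seek(image, address):
    """Resynchronize on the record header, then return the rest of the record
    starting at the noted position."""
    char_addr, back = address
    header_addr = char_addr - back          # step back to the header
    (length,) = struct.unpack_from(">I", image, header_addr)
    data_start = header_addr + HEADER
    return image[char_addr:data_start + length]


image = build_file([b"hello", b"world!!"])
start_of_second = note(12, 0)   # second record's header starts at address 12
middle_of_first = note(0, 2)    # third data character of the first record
```

With this layout a position at the start of a record always has 4 as its
second integer, which is the case where storing only the first integer would
be enough.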