Xref: utzoo comp.unix.questions:16108 comp.unix.wizards:17946 comp.databases:3487
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ncar!tank!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.unix.questions,comp.unix.wizards,comp.databases
Subject: Re: [fl]seek mechanism
Keywords: file seeks on large files
Message-ID: <19382@mimsy.UUCP>
Date: 2 Sep 89 04:19:21 GMT
References: <1631@unccvax.UUCP>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 32

In article <1631@unccvax.UUCP> cs00chs@unccvax.UUCP (charles spell) writes:
>Does the kernal optimize seeks within an open file?

This question is basically meaningless, because the kernel (note
spelling) code for lseek, minus error checks and with names expanded, is:

	fp = this_process.open_files[file_descriptor];
	switch (whence) {
	case 0: fp->f_offset = offset; break;	/* from start of file */
	case 1: fp->f_offset += offset; break;	/* from current offset */
	case 2: fp->f_offset = fp->f_inode->i_file_size + offset; break;
						/* from end of file */
	}
	return;

All lseek does is store a number; no disk activity happens until the
next read or write.  Offsets from the end of the file are a tiny bit
slower than other offsets due to the extra indirection required to get
the file size.  If a system call requires 100 machine instructions
(this estimate is probably a bit low), case 2 might be 1% slower.

>[to go from byte 500000 to byte 500001] with file descriptors:
>fseek(fp, 1L, 1); -OR- fseek(fp, 500001L, 0);

Presumably you mean `with stdio', since fseek takes a stdio stream, not
a file descriptor.  In general, existing stdio implementations are
better with offsets from 0 than with offsets from `current point' or
`end of file', so the latter call, fseek(fp, 500001L, 0), would be
faster.  But `(void) getc(fp)' would be faster still.  Stdio has to
make two lseek calls per fseek, in the most general case, since it
needs to first discover where it is: it cannot assume the descriptor
started at offset 0 (consider, e.g., `prog >> output', which might be
at byte 5131 when it begins).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris
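
The offset arithmetic in the lseek fragment above can be modelled in
user space and compared against the real system call.  What follows is
only a sketch under stated assumptions: `struct model_file',
`model_lseek' and `model_matches_kernel' are hypothetical names made up
for illustration, not kernel structures or routines, and the model
ignores everything a real kernel checks apart from a negative result.
It assumes a POSIX system with a writable temporary directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-open-file state: just an offset,
 * plus the file size that the kernel would fetch through the inode. */
struct model_file {
    off_t offset;
    off_t size;
};

/* Model of the offset update: returns the new offset, or -1 for an
 * unknown whence or a negative resulting position. */
static off_t model_lseek(struct model_file *fp, off_t offset, int whence)
{
    off_t pos;

    switch (whence) {
    case SEEK_SET: pos = offset; break;              /* whence 0 */
    case SEEK_CUR: pos = fp->offset + offset; break; /* whence 1 */
    case SEEK_END: pos = fp->size + offset; break;   /* whence 2 */
    default: return (off_t)-1;
    }
    if (pos < 0)
        return (off_t)-1;
    fp->offset = pos;
    return pos;
}

/* Compare the model with the real lseek on a scratch file of `size'
 * bytes.  Returns 1 if every step agrees, 0 if any step differs, and
 * -1 if the scratch file could not be set up. */
static int model_matches_kernel(off_t size)
{
    FILE *tmp = tmpfile();
    struct model_file m = { 0, size };
    int fd, ok = 1;

    if (tmp == NULL)
        return -1;
    fd = fileno(tmp);
    if (ftruncate(fd, size) != 0) {
        fclose(tmp);
        return -1;
    }
    ok &= lseek(fd, 10, SEEK_SET) == model_lseek(&m, 10, SEEK_SET);
    ok &= lseek(fd, 1, SEEK_CUR) == model_lseek(&m, 1, SEEK_CUR);
    ok &= lseek(fd, -1, SEEK_END) == model_lseek(&m, -1, SEEK_END);
    assert(m.offset >= 0);
    fclose(tmp);
    return ok;
}
```

Starting the model at offset 500000, `model_lseek(&m, 1L, SEEK_CUR)' and
`model_lseek(&m, 500001L, SEEK_SET)' both leave the offset at 500001,
which is the pair of calls the original question compares.  Note that
an offset relative to the end of the file is added to the size, so a
negative offset is what lands inside the file.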