Xref: utzoo comp.lang.c:28451 comp.lang.misc:4964 comp.sys.ibm.pc:50023 comp.sys.ibm.pc.programmer:1320
Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!usc!snorkelwacker!husc6!m2c!wpi!jhallen
From: jhallen@wpi.wpi.edu (Joseph H Allen)
Newsgroups: comp.lang.c,comp.lang.misc,comp.sys.ibm.pc,comp.sys.ibm.pc.programmer
Subject: Re: fast file copying (was questions about a backup program ...)
Keywords: copy
Message-ID: <12642@wpi.wpi.edu>
Date: 4 May 90 10:27:29 GMT
References: <255@uecok.UUCP> <1990Apr25.125806.20450@druid.uucp> <12578@wpi.wpi.edu> <24164@mimsy.umd.edu>
Reply-To: jhallen@wpi.wpi.edu (Joseph H Allen)
Organization: Worcester Polytechnic Institute, Worcester ,MA
Lines: 38

In article <24164@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>In article <12578@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>>Interestingly, this aspect of the copy program [reading and writing very
>>large blocks] is one place where I think DOS is sometimes faster than
>>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>>Doing so makes overly pessimistic assumptions about the amount of
>>physical memory you're likely to get.
>The optimal point
>is often not `read the whole file into memory, then write it out of
>memory', because this requires waiting for the entire file to come in
>before figuring out where to put the new blocks for the output file.
>It is better to get computation done while waiting for the disk to transfer
>data, whenever this can be done without `getting behind'.  Unix systems
>use write-behind (also known as delayed write) schemes to help out here;
>writers need use only block-sized buffers to avoid user-to-kernel copy
>inefficiencies.

On big, loaded, systems this is certainly true since you want full use of
'elevator' disk optimizing between multiple users.  This should be the normal
mode of operation.
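
To make the buffer-size trade-off concrete, here is a minimal sketch of a copy
loop whose transfer size is a parameter.  It is not taken from any actual 'cp'
or 'dd' source; the name copy_fd and its interface are made up for
illustration, and it assumes a system with POSIX read() and write().

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd through a
 * single buffer of bufsize bytes.  Returns the number of bytes copied,
 * or -1 on error (including bufsize == 0 or a failed allocation). */
long copy_fd(int in_fd, int out_fd, size_t bufsize)
{
    char *buf;
    long total = 0;
    ssize_t n;

    if (bufsize == 0)
        return -1;
    buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    /* One read() per bufsize bytes; read() returns 0 at end of file. */
    while ((n = read(in_fd, buf, bufsize)) != 0) {
        ssize_t off = 0;

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: retry the read */
            free(buf);
            return -1;
        }

        /* write() may accept fewer bytes than asked, so loop until
         * the whole chunk has been written. */
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));

            if (w < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted: retry the write */
                free(buf);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    free(buf);
    return total;
}
```

Calling copy_fd(in, out, 512) issues one read per 512 bytes, roughly the
block-at-a-time behaviour described above, while something like
copy_fd(in, out, 64 * 1024L) hands the system larger requests, similar in
spirit to giving dd a large bs=.  Whether the larger size is actually faster
depends on the system's read-ahead and write-behind, so it is worth measuring
rather than assuming.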

The problem with this on smaller UNIX systems is that the disk interleave
will be missed unless there is very intelligent read-ahead.  If you're lucky
enough to have all your memory paged in, one read call may, if the system is
designed right, read in contiguous sets of blocks without missing the
interleave.  For things like backups you usually want to tweak it a bit since
this operation is slow and can usually be done when no one else is on the
system.

Also, for copying to tapes and raw disks, 'cp' is usually very bad.  I think
dd can be used to transfer large sets of blocks.  On one system I know of, if
you 'cp' between two raw floppy devices, the floppy lights will blink on and
off for each sector.  Also you have to be careful about what is buffered and
what isn't, and what happens when you mix the two.
--
jhallen@wpi.wpi.edu (130.215.24.1)