Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!think.com!mintaka!spdcc!ima!dirtydog!suitti
From: suitti@ima.isc.com (Stephen Uitti)
Newsgroups: comp.arch
Subject: Re: Sun bogosities, including MMU thrashing
Message-ID: <1991Jan24.193458.16429@dirtydog.ima.isc.com>
Date: 24 Jan 91 19:34:58 GMT
References: <5257@auspex.auspex.com> <3956@skye.ed.ac.uk> <5390@auspex.auspex.com> <1991Jan21.225211.17757@gpu.utcs.utoronto.ca>
Sender: news@dirtydog.ima.isc.com (NEWS ADMIN)
Reply-To: suitti@ima.isc.com (Stephen Uitti)
Organization: Interactive Systems, Cambridge, MA 02138-5302
Lines: 88

In article pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>On 21 Jan 91 22:52:11 GMT, dennis@gpu.utcs.utoronto.ca (Dennis Ferguson) said:
>dennis> And I distinctly remember arguments being made at the time to
>dennis> the effect that the speed of the Berkeley fast file system
>dennis> (still a fairly recent innovation then) was almost exclusively
>dennis> due to the larger block size, and that the block clustering
>dennis> algorithm, which makes the supporting code complex and
>dennis> relatively CPU-intensive when writing, really was unnecessary.
>
>Yes, on that type of machine (stupid disc controllers, timesharing)
>that's true. The question: is the merit of the improvement because of
>fixed static clustering or because of its consequences in the case of
>sequential access? ....

Interactive Systems has a file system based on the V.3.2
(essentially V7) file system, but with bitmapped block
allocation. You can still increase the static block size, but
that no longer buys anything. The bitmapped allocation improves
file system performance by a factor of 10 to 15 on good hardware,
by my benchmarks.

There is a vast difference between a good disk controller and a
lesser disk controller. The drivers make use of track buffering,
if available. A controller which can read a track in one
revolution performs as if rotational latency is not an issue.
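
To make the bitmapped allocation mentioned above concrete, here is a minimal sketch in Python of how a free-block bitmap lets an allocator find contiguous runs cheaply. This is not the Interactive Systems code. The names find_free_run and allocate, the list-of-booleans bitmap, and the "hint" parameter (start looking near the file's last block) are all assumptions made up for illustration.

```python
def find_free_run(bitmap, want, start=0):
    """Return the first index of a run of `want` free blocks at or after
    `start`, or -1 if there is none. bitmap[i] is True when block i is in use.
    (Hypothetical helper, not taken from any real file system.)"""
    run_start = -1
    run_len = 0
    for i in range(start, len(bitmap)):
        if bitmap[i]:
            run_len = 0              # a used block breaks the current run
        else:
            if run_len == 0:
                run_start = i        # a new run of free blocks begins here
            run_len += 1
            if run_len == want:
                return run_start
    return -1


def allocate(bitmap, want, hint=0):
    """Mark `want` contiguous blocks as used, preferring space at or after
    `hint`. Returns the first block of the run, or -1 if no run fits."""
    first = find_free_run(bitmap, want, hint)
    if first < 0 and hint > 0:
        first = find_free_run(bitmap, want, 0)   # nothing past the hint: retry from block 0
    if first < 0:
        return -1
    for i in range(first, first + want):
        bitmap[i] = True
    return first
```

With a free list, adjacent free blocks are not obviously adjacent. With a bitmap, finding a contiguous run is one linear scan, and that is what keeps files contiguous or nearly so.
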
With read-ahead, and contiguous or nearly contiguous files,
sequential access is very fast. Random access is still pretty
fast, mostly because the blocks are still close to each other.
The overhead of figuring out where they are is still an issue.

The straight V.3.2 file system was so slow that a 386/25 with a
300 MB, 17 millisecond average access disk drive was good for
only a small number of engineers, about four. With the software
change, you can really run thirty. Of course, we don't do that.
We run X windows instead.

I used to think that a file system as complicated as the Berkeley
Fast File System could not be as fast as a simpler file system
with a bitmap. If it is a nontrivial exercise to figure out where
the next block of a file might be, you won't do it quickly. V.4
shows us that if you throw all of your memory at buffering, then
you can make it fast. I can't help but believe there is some
better use for memory.

Perhaps a relatively simple file system using sorted extent lists
would provide a faster system for sequential and random access.
If the file system were implemented in a user process under UNIX,
you'd be able to profile the system, and figure out what the
latencies really are for each part. Then if you made interprocess
communication fast...

It would also be nice if the inode information were near
something else. For that matter, it would be nice if inodes had a
larger name space than 16 bits (V7) and could be allocated
dynamically, rather than statically (at mkfs time). A good spot
would be in directories. This makes hard links more difficult.
This should not be a deterrent.

>The billjoys will say "so what, sequential access is 80%, so we go for
>it, and damn the rest!".

In the billjoys' defense, before this change, it was 100% slow,
rather than just 20% slow.

>Thus sequential access is used also where under a more balanced design
>random access would be used.
>For example UNIX editors traditionally
>copied the file to edit twice sequentially on every edit. Now they load
>it into memory and write it back from there, which gives most VM
>subsystems the fits, for other similar reasons.

I wrote a text editor for my record-oriented calculator. Memory
and file space are shared in this environment. I decided that I
couldn't afford two copies of the file. I only read enough of the
file for the current display (not a lot). When the user makes a
change, I remember the change, where it goes, and what was
deleted. I only write the new file at the end, if the user wants
it. I write the file by deleting all records that were marked for
deletion, creating the new file, then cleaning up the remainder.

I decided that a similar system could be used for a
block-oriented editor for larger systems. It would handle
arbitrarily large files, be able to make changes to single files
that nearly fill disks, have very fast startup (since it only has
to read the first screen), would not chew up the VM needlessly,
etc. One day, I'll write the editor's file handling subroutine
library.

Stephen.
suitti@ima.isc.com
"We Americans want peace, and it is now evident that we must be
prepared to demand it. For other peoples have wanted peace, and
the peace they received was the peace of death."
- the Most Rev. Francis J. Spellman, Archbishop of New York.
  22 September, 1940
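
As a rough illustration of the editor scheme described above (leave the original records untouched, remember only the changes, and produce the new file at the end), here is a small Python sketch. It is not the calculator editor. The class DeferredEditor and its methods are hypothetical names, and a Python list stands in for the on-disk record file.

```python
import itertools


class DeferredEditor:
    """Hypothetical sketch: edits are logged against original record numbers
    and applied only when the edited text is read back or saved."""

    def __init__(self, records):
        self.records = records   # stands in for the file on disk; never modified
        self.deleted = set()     # original record numbers marked for deletion
        self.inserted = {}       # original record number -> new records placed before it

    def delete(self, n):
        self.deleted.add(n)      # the old text stays in self.records, so undo is cheap

    def insert_before(self, n, text):
        self.inserted.setdefault(n, []).append(text)

    def replace(self, n, text):
        self.insert_before(n, text)
        self.delete(n)

    def merged(self):
        """Yield the edited file one record at a time, with no second full copy."""
        for n, rec in enumerate(self.records):
            yield from self.inserted.get(n, [])
            if n not in self.deleted:
                yield rec
        # records inserted "before" one past the end are appended
        yield from self.inserted.get(len(self.records), [])

    def window(self, first, count):
        """Return one screenful: `count` edited records starting at edited
        position `first`. This simple version rescans from the top; a real
        editor would keep an index so it can seek directly."""
        return list(itertools.islice(self.merged(), first, first + count))
```

Saving amounts to writing out merged() and then discarding the records marked deleted. Startup cost is one call to window(0, screen_height), however large the file is.
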