Path: utzoo!utgpu!water!watmath!orchid!atbowler
From: atbowler@orchid.waterloo.edu (Alan T. Bowler [SDG])
Newsgroups: comp.os.misc
Subject: Re: Contiguous files; extent based file systems
Message-ID: <12210@orchid.waterloo.edu>
Date: 5 Jan 88 00:41:09 GMT
References: <561@amethyst.ma.arizona.edu> <3228@tut.cis.ohio-state.edu> <177@cullsj.UUCP> <1931@rti.UUCP> <517@usl-pc.UUCP>
Reply-To: atbowler@orchid.waterloo.edu (Alan T. Bowler [SDG])
Distribution: na
Organization: U. of Waterloo, Ontario
Lines: 49

Extent based file systems can be very nice. If the file is contiguous, file access, whether random or sequential, is very fast. The in-memory representation is very compact. It is easy to change a program to use larger buffers, and this will almost certainly have a positive effect on program and system performance.

The problems occur when you need to grow an existing file. A lot of operating systems have gotten screwed up at this point and given extent based systems an undeserved bad name. Thus we have people in this group making incorrect statements about extent based systems in general, and giving examples from some badly implemented system. So far I have seen the following arguments:

1) You can't grow the file.
This is not true. You simply allocate another block and add it to the file description.

2) You can only grow the file "n" times.
No, you simply allocate an extension to the file description and put the new extent descriptor there.

So far, if you just follow the above rules, you will degenerate into a fixed block scheme like Unix systems have classically used. Simply call the file descriptor extensions indirect blocks and you have essentially the same thing.

3) You have to specify an initial and maximum size when creating the file.
No, but you can allow the user to do such a thing and be rewarded with a performance improvement.

4) The available space gets fragmented and you can't allocate a file because there is not a big enough chunk available.
No, you of course try to allocate new files contiguously, but if you can't, simply return the space as a set of chunks. The logical to physical mapping is done by the system, so except for performance the user program can't tell.

The real issue is that an extent based system requires a reasonably regular file system maintenance procedure to "compact" multi-extent files into a single extent, so the user doesn't have to worry about this. One approach is to simply dump and restore the whole file system once every few months. This is a reasonably adequate approach in most cases, but an even better approach involves running a daemon or other automatically scheduled program that once a day, or once a week, simply walks through the system and rearranges files to compact multi-extent files, and coalesces the available space so that large holes are available. This program does not have to do a "perfect" job. If a file is currently busy, it can simply skip it and go on. It will probably get it the next time.

There are real advantages to extent based systems, but like everything else they do not come for free. In this case there is a lot of work required to do a "complete" implementation job. Incomplete extent based systems seem to work for a while, but a user feels like he has hit a brick wall when he runs into a restriction imposed by an incomplete implementation.
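
To make the points above concrete, here is a minimal sketch in Python of the scheme described: allocate contiguously when a big enough hole exists, fall back to a set of chunks when it does not, map logical blocks to physical ones through the extent list, and let a compaction pass skip busy files. It is not taken from any real system. The names Extent, Volume, File and compact are invented for illustration, the allocator is assumed to be plain first-fit, and the actual copying of data blocks is left out.

```python
from dataclasses import dataclass, field

@dataclass
class Extent:            # hypothetical name: one run of contiguous blocks
    start: int           # first physical block number
    length: int          # number of blocks in the run

@dataclass
class Volume:            # hypothetical name: tracks free space only
    free: list           # list of Extent holes, kept sorted by start

    def _carve(self, hole, n):
        """Take the first n blocks of a hole and shrink (or drop) the hole."""
        piece = Extent(hole.start, n)
        hole.start += n
        hole.length -= n
        if hole.length == 0:
            self.free.remove(hole)
        return piece

    def allocate(self, nblocks):
        """Return extents totalling nblocks: one extent if any hole fits."""
        if nblocks <= 0:
            raise ValueError("nblocks must be positive")
        for hole in self.free:                 # assumption: first-fit
            if hole.length >= nblocks:
                return [self._carve(hole, nblocks)]
        if sum(h.length for h in self.free) < nblocks:
            raise OSError("volume full")
        got, need = [], nblocks
        while need > 0:                        # fall back to a set of chunks
            hole = self.free[0]
            take = min(hole.length, need)
            got.append(self._carve(hole, take))
            need -= take
        return got

    def release(self, ext):
        """Return an extent to the free list and coalesce adjacent holes."""
        self.free.append(Extent(ext.start, ext.length))
        self.free.sort(key=lambda e: e.start)
        merged = []
        for hole in self.free:
            if merged and merged[-1].start + merged[-1].length == hole.start:
                merged[-1].length += hole.length
            else:
                merged.append(hole)
        self.free = merged

@dataclass
class File:              # hypothetical name: the "file description"
    extents: list = field(default_factory=list)

    def grow(self, vol, nblocks):
        """Growing just appends extent descriptors; adjacent runs merge."""
        for ext in vol.allocate(nblocks):
            last = self.extents[-1] if self.extents else None
            if last is not None and last.start + last.length == ext.start:
                last.length += ext.length
            else:
                self.extents.append(ext)

    def physical(self, logical):
        """Map a logical block number to a physical block number."""
        for ext in self.extents:
            if logical < ext.length:
                return ext.start + logical
            logical -= ext.length
        raise IndexError("block past end of file")

def compact(f, vol, busy=False):
    """Move a multi-extent file into one extent if a large enough hole exists.

    Returns True if the file was compacted. Busy files are skipped, and a
    real system would copy the data blocks before switching the descriptor.
    """
    if busy or len(f.extents) <= 1:
        return False
    total = sum(e.length for e in f.extents)
    if not any(h.length >= total for h in vol.free):
        return False                           # try again on the next pass
    old, f.extents = f.extents, vol.allocate(total)
    for ext in old:
        vol.release(ext)
    return True
```

A periodic daemon in this sketch would simply call compact on every file it finds and ignore the False returns. The user program only ever sees logical block numbers through physical(), which is why fragmentation shows up as a performance cost rather than a failure.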