Path: utzoo!censor!geac!torsqnt!lethe!yunexus!ists!helios.physics.utoronto.ca!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!ames!uhccux!munnari.oz.au!metro!socs.uts.edu.au!jeremy
From: jeremy@socs.uts.edu.au (Jeremy Fitzhardinge)
Newsgroups: comp.unix.internals
Subject: Ideas for changes to Unix filesystem
Message-ID: <1991Jan30.143326.16676@socs.uts.edu.au>
Date: 30 Jan 91 14:33:26 GMT
Organization: Mystaecle and Saecret Order of Dagon, Bexley Chapter
Lines: 93

I've been having a few ideas about changes to the unix filesystem that
may or may not be useful.  I'd like comments, but not flames unless
you're feeling really motivated.

When I refer to "the" unix filesystem, I'm talking about a BSD FFS,
since the old "ILike14Charact" filesystem of System V R<4 seems to have
faded out; however, changes to it are simpler to explain, so I may use
it for examples.

I have 3 main ideas for change:

1 - a flink(char *path, int fd) system call/operation.

It seems odd to me that you can open a file, unlink it from the
filesystem, and then not be able to put it back as a file unless you
actually copy it out.  What I was thinking about was a system call that
lets you make a new directory entry that refers to the inode of an open
file.

The syscall would allow the link to take place if the user can create a
directory entry at the specified path which can point to the inode of
the open file.

The only security problem I can think of is this: it would be possible
to link a file back into the filesystem, into a publicly accessible
directory, after some time, even if the path to the original file has
since been closed off.  If this were a real problem, you'd have to use a
utility like fuser to see what processes have what files open in the
closed-off area.  However, this situation is only marginally different
from just copying out the file, which would have fewer side effects
anyway (like not incrementing the link count).
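To make the intended flink() semantics concrete, here is a rough sketch as
a toy in-memory model in C.  It is not kernel code and not an existing
interface: every name in it (toy_fs, toy_creat, toy_flink and so on) is
made up for illustration, there is a single flat directory, and the
permission check described above is left out entirely.  The only point is
that an inode with no names left survives while it is open, and flink can
give it a name again.

```c
#include <assert.h>
#include <string.h>

#define NINODES  8
#define NDIRENTS 16
#define NFDS     8
#define NAMELEN  32

/* Toy model only: one flat directory, no permissions, no data blocks. */
struct toy_inode  { int in_use, nlink, nopen; };
struct toy_dirent { char name[NAMELEN]; int ino; };  /* ino < 0: free slot */
struct toy_fs {
    struct toy_inode  inodes[NINODES];
    struct toy_dirent dir[NDIRENTS];
    int fds[NFDS];                                   /* fd -> ino, or -1 */
};

void toy_init(struct toy_fs *fs)
{
    memset(fs, 0, sizeof *fs);
    for (int i = 0; i < NDIRENTS; i++) fs->dir[i].ino = -1;
    for (int i = 0; i < NFDS; i++) fs->fds[i] = -1;
}

/* Return the inode number bound to name, or -1 if there is no such entry. */
int toy_lookup(const struct toy_fs *fs, const char *name)
{
    for (int i = 0; i < NDIRENTS; i++)
        if (fs->dir[i].ino >= 0 && strcmp(fs->dir[i].name, name) == 0)
            return fs->dir[i].ino;
    return -1;
}

/* Add a directory entry for ino and bump its link count. */
static int toy_enter(struct toy_fs *fs, const char *name, int ino)
{
    if (strlen(name) >= NAMELEN || toy_lookup(fs, name) >= 0) return -1;
    for (int i = 0; i < NDIRENTS; i++) {
        if (fs->dir[i].ino < 0) {
            strcpy(fs->dir[i].name, name);
            fs->dir[i].ino = ino;
            fs->inodes[ino].nlink++;
            return 0;
        }
    }
    return -1;                                       /* directory full */
}

/* Free the inode once no name and no descriptor refers to it. */
static void toy_release(struct toy_fs *fs, int ino)
{
    struct toy_inode *ip = &fs->inodes[ino];
    assert(ip->nlink >= 0 && ip->nopen >= 0);
    if (ip->nlink == 0 && ip->nopen == 0) ip->in_use = 0;
}

/* Create a new file under name and open it; returns a descriptor or -1. */
int toy_creat(struct toy_fs *fs, const char *name)
{
    int ino, fd;
    for (ino = 0; ino < NINODES && fs->inodes[ino].in_use; ino++) ;
    for (fd = 0; fd < NFDS && fs->fds[fd] >= 0; fd++) ;
    if (ino == NINODES || fd == NFDS) return -1;
    fs->inodes[ino] = (struct toy_inode){ 1, 0, 0 };
    if (toy_enter(fs, name, ino) < 0) { fs->inodes[ino].in_use = 0; return -1; }
    fs->fds[fd] = ino;
    fs->inodes[ino].nopen = 1;
    return fd;
}

/* Remove a name; the inode lives on while it is still open. */
int toy_unlink(struct toy_fs *fs, const char *name)
{
    for (int i = 0; i < NDIRENTS; i++) {
        if (fs->dir[i].ino >= 0 && strcmp(fs->dir[i].name, name) == 0) {
            int ino = fs->dir[i].ino;
            fs->dir[i].ino = -1;
            fs->inodes[ino].nlink--;
            toy_release(fs, ino);
            return 0;
        }
    }
    return -1;
}

/* The proposed operation: give the inode behind an open fd a new name. */
int toy_flink(struct toy_fs *fs, const char *name, int fd)
{
    if (fd < 0 || fd >= NFDS || fs->fds[fd] < 0) return -1;
    return toy_enter(fs, name, fs->fds[fd]);
}

int toy_close(struct toy_fs *fs, int fd)
{
    if (fd < 0 || fd >= NFDS || fs->fds[fd] < 0) return -1;
    int ino = fs->fds[fd];
    fs->fds[fd] = -1;
    fs->inodes[ino].nopen--;
    toy_release(fs, ino);
    return 0;
}
```

In this model, toy_creat followed by toy_unlink leaves an open inode with
a link count of zero; toy_flink then raises the count back to one, so a
later toy_close no longer frees the inode.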

2 - insertion/deletion in the middle of a file without copying

Inserting and deleting chunks in the middle of a file seems like a
pretty common operation, yet it is algorithmically quite inefficient as
a result of the way the filesystem is designed.

What I was thinking about is storing the logical size of each block in
the indirect blocks, as well as its location.  When I say "block" I'm
referring to the smallest singly writeable unit on some disk-like
device - basically a SysV FS block, as opposed to BSD's myriad of
sectors/blocks/clusters etc.

When the file is being used normally (new data being appended to the
end), all blocks but the last will be full of valid data.  However, when
data is added into the middle of the file, a new block is inserted into
the blocklist.  If the insertion falls in the middle of a currently
existing block, then that block's logical size is truncated to the
offset of the insertion within the block.  The remainder is copied into
the newly allocated block.  The logical size of the new block is set to
the remainder's size, and the file pointer is set to the end.

If the file is read, it appears exactly the same until new data is
written.  On a write, instead of overwriting existing data, the data is
written to fill the remainder of the new block, thus increasing its
logical size.  When the logical size matches the physical size, another
block is inserted into the file.

Rather than having separate "write with insert" operations (as I implied
above), I think the best way of allowing program support would be an
"insert" system call that inserts a certain amount of empty space into
an open file at the current position.  Naturally, the blocks are only
inserted into the file's blocklist; they are not actually allocated on
disk.  If a negative amount is specified, then the space is closed up.

If the file becomes too fragmented, it can simply be rewritten
contiguously, which would fill up all the gaps.
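
The split-and-fill step can be sketched in user space as below.  This is
one reading of the scheme, not a definitive design: the names (struct ext,
struct file, map_insert, file_insert, file_read) are invented, the "disk"
is a small array, BLKSZ is a tiny 8 bytes so the splits are easy to follow,
and inserted data is assumed to go into the free tail of the truncated
block, with fresh blocks chained after it once that fills up.  Deletion and
freeing of blocks are not modelled.

```c
#include <assert.h>
#include <string.h>

#define BLKSZ  8                 /* tiny physical block, for illustration */
#define MAXBLK 16

/* One blocklist entry: where the block lives, and how much of it is valid. */
struct ext { int phys; int used; };

struct file {
    int nblk;                    /* entries in the blocklist */
    struct ext map[MAXBLK];      /* stands in for the indirect blocks */
    int nphys;                   /* next unallocated "disk" block */
    char disk[MAXBLK][BLKSZ];
};

/* Insert a fresh, empty block at position pos in the blocklist.
   Only the small map entries are shuffled; no file data moves. */
static int map_insert(struct file *f, int pos)
{
    if (f->nblk == MAXBLK) return -1;
    memmove(&f->map[pos + 1], &f->map[pos],
            (size_t)(f->nblk - pos) * sizeof f->map[0]);
    f->map[pos].phys = f->nphys++;
    f->map[pos].used = 0;
    f->nblk++;
    return 0;
}

/* Insert n bytes at logical offset off; returns 0, or -1 on failure. */
int file_insert(struct file *f, long off, const char *src, long n)
{
    int i = 0;
    assert(off >= 0 && n >= 0);
    while (i < f->nblk && off > f->map[i].used) off -= f->map[i++].used;
    if (i == f->nblk) {                  /* empty file, or off past the end */
        if (off != 0 || map_insert(f, i) < 0) return -1;
    } else if (off < f->map[i].used) {   /* mid-block: split off the tail */
        int tail = f->map[i].used - (int)off;
        if (map_insert(f, i + 1) < 0) return -1;
        memcpy(f->disk[f->map[i + 1].phys],
               f->disk[f->map[i].phys] + off, (size_t)tail);
        f->map[i + 1].used = tail;
        f->map[i].used = (int)off;
    }
    while (n > 0) {                      /* fill block i, then chain new ones */
        if (f->map[i].used == BLKSZ) {
            if (map_insert(f, i + 1) < 0) return -1;
            i++;
        }
        int take = BLKSZ - f->map[i].used;
        if (take > n) take = (int)n;
        memcpy(f->disk[f->map[i].phys] + f->map[i].used, src, (size_t)take);
        f->map[i].used += take;
        src += take;
        n -= take;
    }
    return 0;
}

/* Read up to n bytes from logical offset off; returns the byte count. */
long file_read(const struct file *f, long off, char *buf, long n)
{
    long done = 0;
    assert(off >= 0);
    for (int i = 0; i < f->nblk && done < n; i++) {
        int used = f->map[i].used;
        if (off >= used) { off -= used; continue; }
        long take = used - off;
        if (take > n - done) take = n - done;
        memcpy(buf + done, f->disk[f->map[i].phys] + off, (size_t)take);
        done += take;
        off = 0;
    }
    return done;
}
```

Starting from a zeroed struct file, appending is just an insert at the
current end of the file.  A mid-block insert copies at most one block's
worth of data (the tail of the split block); everything else is movement
of blocklist entries.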

This mechanism saves having to copy any part of the file larger than a
physical block, but it does mean that there is quite a bit of shuffling
about of the indirect blocks, which could make it hard to guarantee that
the operation is atomic.

It might also be worth making insertion an attribute of a file when it's
created, so that only files that need it have the overhead of logical
block sizes in the indirect blocks.

3 - limited sized files

This idea is essentially quite similar to the above - basically I've
been sick of simple log files that grow and grow without bound, often
making serious holes in a file system.  The idea is simply this: create
a file that has a certain maximum size.  If there is a write to the end
of the file that would normally grow the file, then rather than ignoring
it, blocks from the front of the file are reallocated and reordered to
hold the new data.  I suppose the file size would best be in units of
filesystem blocks; however, if implemented in conjunction with
insertion/deletion, this need not be the case.

These are ideas that may be implemented in a filesystem that's currently
being designed.  I would quite like comments and ideas from fellow
experienced Unix users/hackers.

--
Jeremy Fitzhardinge: jeremy@ultima.socs.uts.edu.au
                     jeremy@utscsd.csd.uts.edu.au
Irregular adjective:
	I have a moral standpoint
	You are assertive
	He is aggressive