Path: utzoo!telly!lethe!geac!yunexus!davecb
From: davecb@yunexus.YorkU.CA (David Collier-Brown)
Newsgroups: comp.arch
Subject: Re: Extremely Fast Filesystems
Message-ID: <13526@yunexus.YorkU.CA>
Date: 6 Aug 90 01:33:21 GMT
Article-I.D.: yunexus.13526
References: <5539@darkstar.ucsc.edu> <13285@yunexus.YorkU.CA> <30728@super.ORG>
Organization: York U. Computing Services
Lines: 49

puder@zeno.informatik.uni-kl.de (Arno Puder) writes:
| Tanenbaum's philosophy is that memory is getting cheaper and cheaper,
| so why not load the complete file into memory? This makes the server
| extremely efficient. Operations like OPEN or CLOSE on files are no
| longer needed (i.e. the complete file is loaded for each update).

rminnich@super.ORG (Ronald G Minnich) writes:
| This is very elegant, but there is 
| a problem. We're running out of address bits again. 
|
| I gave the standard "why shared memory is a nice way to do a high speed
| network interface" talk the other day and someone pointed out that on 
| Multics, with memory-mapped files, you always had to support the read-write 
| interface for any program because the address space of the machine was too 
| small for the memory-file abstraction to cover all files [...]

  To again misquote Morven's Metatheorem, ``any problem in computer
science can be solved with one more level of indirection...''
  The address-space problem is dealt with by putting a transparent
interface on top of the large files, something like Multics MSFs
(multi-segment files), but done so the ill-advised applications
programmer (me --dave) won't depend on knowing how it was implemented.
  I specifically considered large relational databases built on a bullet
fileserver!  The primitives provided to the DBMS would be read, write,
commit and abort on a pre-initiated relation.  Many relations would be
small enough to load into a properly-configured fileserver, some would not.
The overlarge ones could be split two ways: transversely (by rows) or
longitudinally (by fields).  Transversely would be transparent to the
application, if not to the human DBM (he'd detect a performance loss,
probably).  Longitudinally would be visible to the applications (in a
two-schema DBMS), because the DBM would have to split them based on
field-usage statistics.  It wouldn't be a problem in a three-schema
architecture (modulo fiascos).
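
  A rough sketch of the two kinds of split, in Python, reading
"transversely" as cutting a relation by rows and "longitudinally" as
cutting it by fields.  Everything here is made up for illustration:
the function names, the tiny employees relation, and the idea that the
"hot" fields have already been picked from usage statistics.  It is
not how any real bullet server or DBMS does it.

```python
def split_transversely(rows, max_rows):
    """Cut a relation into row-wise pieces of at most max_rows rows each."""
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


def split_longitudinally(rows, key, hot_fields):
    """Cut a relation into two field-wise pieces that both carry the key.

    hot_fields is assumed to come from field-usage statistics.
    """
    hot = [{f: r[f] for f in r if f == key or f in hot_fields} for r in rows]
    cold = [{f: r[f] for f in r if f == key or f not in hot_fields} for r in rows]
    return hot, cold


def rejoin(hot, cold, key):
    """Put a longitudinal split back together by matching on the key."""
    cold_by_key = {r[key]: r for r in cold}
    return [{**h, **cold_by_key[h[key]]} for h in hot]


# Hypothetical relation, just large enough to show both splits.
employees = [
    {"id": 1, "name": "ann", "salary": 10, "notes": "long text"},
    {"id": 2, "name": "bob", "salary": 20, "notes": "more text"},
    {"id": 3, "name": "cy", "salary": 30, "notes": "even more"},
]

pieces = split_transversely(employees, 2)
hot, cold = split_longitudinally(employees, "id", {"name", "salary"})
```

  The row-wise pieces all have the same fields, so a layer under the
application can hide them.  The field-wise pieces have different
shapes and need a join on the key to get the original rows back, which
is why an application written against the stored layout would notice.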

  Note that this is not a general answer to the problem, though.  Full
generality does require some form of ``extra-long address'', whether
implemented as a segment number sequence, a ``special large address'' in
either hardware or software, or a stdio FILE emulation library that only
provided it for seek/tell operations and hid it otherwise.
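
  For the last of those options, a minimal sketch in Python of a
file-like object that keeps its data in fixed-size segments and only
shows the long address through seek and tell.  The class name
SegmentedFile, the segment_size parameter and the in-memory bytearray
segments are all assumptions of mine for illustration; a real version
would map each segment to a file on the server.

```python
class SegmentedFile:
    """File-like object built from fixed-size segments.

    Callers see one long position via seek/tell; the (segment, offset)
    pair is computed internally and never exposed.
    """

    def __init__(self, segment_size):
        self.segment_size = segment_size
        self.segments = []      # one bytearray per segment
        self.pos = 0            # the "extra-long address"
        self.length = 0         # highest position ever written

    def seek(self, pos):
        if pos < 0:
            raise ValueError("negative seek position")
        self.pos = pos

    def tell(self):
        return self.pos

    def write(self, data):
        for byte in data:
            seg, off = divmod(self.pos, self.segment_size)
            # Allocate zero-filled segments up to the one being written.
            while len(self.segments) <= seg:
                self.segments.append(bytearray(self.segment_size))
            self.segments[seg][off] = byte
            self.pos += 1
        self.length = max(self.length, self.pos)
        return len(data)

    def read(self, n):
        end = min(self.pos + n, self.length)
        out = bytearray()
        while self.pos < end:
            seg, off = divmod(self.pos, self.segment_size)
            take = min(self.segment_size - off, end - self.pos)
            out += self.segments[seg][off:off + take]
            self.pos += take
        return bytes(out)


f = SegmentedFile(segment_size=4)
f.write(b"hello world")     # spans three 4-byte segments
f.seek(3)
middle = f.read(5)          # crosses a segment boundary
```

  A program that only reads and writes sequentially never sees the
segment size at all; only seek and tell carry the long position.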

  I wouldn't mind the latter too much: it's a nice interface for 90% of
the programs I've ever written, since they mostly read and wrote small
sequential files...  The other 10% took the other 90% of my time (:-)).

--dave
-- 
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave 
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |   thunder across the bay" --david kipling