Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!ucsd!sdd.hp.com!spool.mu.edu!uunet!zephyr.ens.tek.com!gvgpsa!treehouse!andyp From: andyp@treehouse.UUCP (Andy Peterman) Newsgroups: comp.sys.mac.programmer Subject: Re: HFS, extent, B*-trees, oh joy... Message-ID: <804@treehouse.UUCP> Date: 14 Feb 91 22:18:43 GMT References: <1991Feb13.210702.9407@athena.mit.edu> Distribution: usa Organization: The Treehouse Lines: 95 In article <1991Feb13.210702.9407@athena.mit.edu> captkidd@athena.mit.edu (Ivan Cavero Belaunde) writes: >I'm trying to write a routine that takes a file identifier (filename and >WDRefNum or filename, dirID and vRefNum) and returns a pointer to a block >containing a list of the block numbers on that volume occupied by a file. >Although I understand the basic procedure pretty clearly, there are a couple >of stumbling blocks that are driving me crazy. See if you can find a manual for the old disk editor program Fedit Plus. It has a B-Tree description that goes beyond that of IM volume IV. I'll try to answer some of your questions in case you can't find it: >Basically a file consists of one or more extents (chunks of contiguous >blocks in the volume). Information about the extents is stored in a B*-tree, >on which each node has the following structure: > > ... [ stuff deleted] > >This structure doesn't seem to be very well defined, however. IM >says that all nodes (no matter what kind) are a fixed size, and thus >one can access the fields at the end which point to the records (which >can be of varying size). I haven't been able to find this fixed size >documented anywhere, however (anyone know?). (It's not absolutely necessary, >since I know the size records I'm dealing with, but it'd be better). The first node of a B-Tree is a header node (type 1). It will have three records. The first one, starting at an offet of $000E into the node, contains the following: $000E Logical sector number of the last leaf node $0012 The size of each B-Tree node (always 512 bytes or one sector) $0014 Maximum size of a key in the B-Tree. $0016 The total number of nodes in the B-Tree. $001A The number of free nodes in the B-Tree. You can see from the above that the size of a B-Tree node is specified at an offset of $0012 into the B-Tree file (first node, first record). The second record of the B-Tree header node (at $0078) is unused. The third record (at $00F8) contains a bit map of which sectors (nodes) are in use for the B-Tree file. >The extents B*-tree is not in RAM, however. It's in a file (second FCB in the >FCB buffer). So my question is: what's the format of this file (so I >can actually navigate the B*-tree)? Is there a header before the actual >data comes? And do the node pointers effectively translate to marks >within the extent file? To use either the extent or catalog b-tree file, get the Volume Control Block for the volume and get the refNums from vcbXTRef or vcbCtlRef (see IM-IV page 176-177). These files will alread be open and all you have to do is do a standard PBRead using these refnums. (I might as well mention it now - DO NOT WRITE TO EITHER OF THESE FILES!!!!). The files just contain the b-tree nodes - if you read the first 512 bytes of the file, you will get the header node as described above. The next 512 bytes will be node 2, etc. >Now it would seem that what I need to do is: > > 1) Open the file and get a refNum for it. > 2) Get the info from the FCB (file number) > 3) Get info from the corresponding VCB > (vcbAlBlSt for calculating actual block #s, > vcbXTRef [the extent file might not always be the second > FCB in the FCB buffer]) No. See above for an easier solution. > 4) Navigate the extent tree, searching for the first extent record, > then using the data to calculate the search key for the > next extent record, and so on, building the list of block #s > as we go. Well, sort of. If you're looking for the extent of a file, you need to start with the extent entry in its FCB (or catalog B-Tree entry). The first three extents of a file are stored in the fcbExtRec field of the FCB (see IM-IV page 180). Most files, unitl they get fragmented, do not contain entries in the extent b-tree - they just have one to three entries here. >Now, has anyone out there done this before? Are there any hairy issues as >far as moving the mark for the extent file FCB behind the file manager's >back? What about the extent cache, and is there a way to flush it (does it >need to be)? And finally, are my data structure definitions correct? I >can't seem to find formal definitions *anywhere*, and I just deduced those I haven't used the extents file, although I have read from the catalog b-tree file successfully. I don't remember if there is a cache problem, although I think I do a FlushVol before I try reading the file. Good luck!!! -- Andy Peterman | Opinions expressed treehouse!andyp@gvgpsa.gvg.tek.com | are definitely those of (916) 273-4569 | my employer!