Path: utzoo!censor!geac!torsqnt!hybrid!scifi!bywater!uunet!auspex!guy
From: guy@auspex.auspex.com (Guy Harris)
Newsgroups: comp.arch
Subject: Re: Sun bogosities, including MMU thrashing
Message-ID: <5390@auspex.auspex.com>
Date: 20 Jan 91 20:13:55 GMT
References: <5257@auspex.auspex.com> <3956@skye.ed.ac.uk>
Organization: Auspex Systems, Santa Clara
Lines: 106

>A poor way of addressing some of these problems is to first double the
>block size from one 512 byte sector to two (like in 2BSD), which is
>however not bad (even if criticized by the Unix authors with compelling
>arguments), and then raising the block size to 16 sectors, like in
>SunOS.

I presume you're not saying that SunOS has fixed 16-sector block sizes;
it has standard 8K-block, 1K-fragment BSD-style file systems, of the
sort you describe later in your posting.

>It also has some big disadvantages, hinted at by Thompson&Ritchie in
>the V7 papers (they advised against doubling the block size from 1 to
>2 sectors with terse and cogent reasoning).

Which paper was that?

>This greatly complicates life, especially as the kernel, up to SunOS 3,
>has a cache of blocks. Having variable block sizes means having variable
>buffer sizes (which complicates life quite a bit, as you must then put
>the buffer cache in virtual memory).

Which makes the buffer chunk size the page size; each buffer is backed
by at least one page of physical memory.

>Notice also that as another default only 10% of memory is reserved for
>the buffer cache, which is, for your typical timesharing or program
>development usage pattern, way too small.

That is, BTW, standard BSD behavior (actually, 4.3 uses 10% of the first
2MB, 5% of the remaining memory).

> For some reason there is still an 'nbuf' variable in the kernel to set
> the number of buffer headers. It would be interesting to know what role
> have buffer headers in the new architecture.

From "SunOS Virtual Memory Implementation", in some EUUG proceedings:

	8.3 UFS Control Information

	Another difficult issue related to the UFS file system and the
	VM system is dealing with the control information that the
	"vnode" driver uses to manage the logical file. For the UFS
	implementation, the control information consists of the
	"inodes", indirect blocks, cylinder groups, and super blocks.
	The control information is not part of the logical file and
	thus the control information still needs to be named by the
	block device offsets, not the logical file offsets.

	To provide the greatest flexibility we decided to retain the
	old buffer cache code with certain modifications for optional
	use by file system implementations. The biggest driving force
	behind this is that we did not want to rely on the system page
	size being smaller than or equal to the size of the control
	information for all file system implementations. Other reasons
	for maintaining parts of the old buffer cache code included
	some compatibility issues for customer written drivers and
	file systems.

	In current versions of SunOS, what's left of the old buffer
	cache is used strictly for UFS control buffers. We did improve
	the old buffer code so that buffers are allocated and freed
	dynamically. If no file system types choose to use the old
	buffer cache code (e.g., a diskless system), then no physical
	memory will be allocated to this pool. When the buffer cache
	is being used (e.g., for control information for UFS file
	systems), memory allocated to the buffer pool will be freed
	when demand for these system resources decreases.

>Unfortunately, the Sun virtual memory technology has an 8KB page size.

At least on Sun-3s, Sun-3xs, and Sun-4s. The "desktop SPARC" machines
have a 4KB page size. I don't remember what the page size on the 386i
was. (The page size on the Sun-2 was 2KB.)

>This means that under SunOS 4 you effectively no longer have variable
>sized buffers against the problem of internal fragmentation.
>All of central memory uses only 8KB pages as the allocation unit. We can thus
>contemplate the ridiculous situation that internal fragmentation is a
>concern for disc space wastage, but not for memory space.

Yeah, I thought so too, once; I asked Rob Gingell about it, and he
pointed out that SunOS 4.x is no different from 4.2andupBSD in this
regard, as noted above. Both of them allocate page-sized chunks for
buffers. 4.2andupBSD just happens to have originally been done on a
machine with 512-byte physical pages, and used 1024-byte "logical"
pages. The 4.2BSD buffer pool code in SunOS 3.x used 8KB pages as the
allocation unit on Sun-3s....

>A typical Sun VM cache has 8 slots, and each slot contains
>the page table for a process.

Probably true, given that most Suns out there are *probably* "Desktop
SPARC" machines with 8 contexts, although the bigger SPARC machines have
16 or 64 contexts.

>The Sun 4 SPARC MMU instead does not cache entire page tables, but
>contiguous subsets of these, called 'pmeg's.

That's actually not unique to "the Sun4 SPARC MMU"; all Sun MMUs work
that way (as opposed to the Motorola MMU on Sun-3xs or the Intel MMU on
Sun386is).

>Each pmeg more or less maps a region of the address space, such as
>text, data, bss, stack, or shared segment.

More or less. Each pmeg maps a chunk of the address space, but it may
take more than one pmeg to map such a region; the code that handles the
MMU doesn't really know about text or data/bss (they're combined) or
stack or....
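
To put numbers on the 4.3BSD buffer cache default mentioned earlier (10% of the first 2MB, 5% of the remaining memory), here is a minimal sketch of the arithmetic in C. The function name `bufcache_bytes` and the use of byte counts are assumptions made for illustration only; a real kernel would presumably work in pages and apply limits that aren't shown here.

```c
/* Illustrative only: "10% of the first 2MB, 5% of the rest".
 * bufcache_bytes() is a hypothetical helper, not a kernel routine. */
#define TWO_MB (2UL * 1024UL * 1024UL)

unsigned long bufcache_bytes(unsigned long physmem_bytes)
{
    /* Machines with 2MB or less get a flat 10%. */
    if (physmem_bytes <= TWO_MB)
        return physmem_bytes / 10;

    /* Otherwise: 10% of the first 2MB plus 5% of everything above it. */
    return TWO_MB / 10 + (physmem_bytes - TWO_MB) / 20;
}
```

Under this rule an 8MB machine would get about 512KB of buffer cache by default (integer division rounds down slightly), which is roughly 6% of memory, not 10%.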