Xref: utzoo comp.arch:17298 comp.lang.functional:316 comp.lang.lisp:3436 comp.lang.prolog:2974
Path: utzoo!attcan!uunet!auspex!guy
From: guy@auspex.auspex.com (Guy Harris)
Newsgroups: comp.arch,comp.lang.functional,comp.lang.lisp,comp.lang.prolog
Subject: Re: GC triggering and stack limit checking by MMU hardware
Keywords: GC, stack, heap, MMU
Message-ID: <3729@auspex.auspex.com>
Date: 23 Jul 90 17:49:31 GMT
References: <1990Jul19.151524.22544@diku.dk> <11075@alice.UUCP>
Followup-To: comp.arch
Organization: Auspex Systems, Santa Clara
Lines: 58

>The problem is that certain machine architectures (eg. Motorola 68K) and OS
>implementations (eg. SunOS at least in some versions) do not provide
>a continuable address violation signal (SIGSEGV), even though at the kernel
>level, address translation faults (page faults) are continuable. Not having
>looked at the insides of those machine/OS combinations, I suspect that
>enough instruction execution context can be saved for filling a page fault
>in the kernel, but not enough for reentering the faulting process to
>handle a Unix signal and allocate the needed storage.

The amount of instruction execution context is the same in both cases;
the only difference is where it has to be stored.  I think SunOS 4.1 may
store enough of it outside the kernel stack to permit *one* such fault
to be continued from (i.e., don't expect to be able to return from a
SIGSEGV that occurs while you're handling a SIGSEGV).

Part of the problem is that Motorola:

    1) wouldn't commit to the "stack puke" stored by the
       680[1andup]0 being "safe" to hand to user-mode code; i.e.,
       they wouldn't say "nothing you can do to the 'stack puke' is
       risky"; and

    2) wouldn't describe the format of the "stack puke" to the
       extent necessary to have the kernel validate it.

I can see their not doing so as being perfectly legitimate; for all I
know, different revisions of the same chip may have different "stack
puke" formats, and even if they don't, they might not want any of that
stuff to be considered a "feature", and have people then write code
dependent on that stuff and lock them into continuing to provide those
"features".  It does, however, complicate the task of allowing user-mode
code to continue from a SIGSEGV.

>These observations are the result of practical experiments 5 or so years
>ago with Sun 2's and VAXen running Berkeley Unix. The former could not
>recover correctly from the segmentation violation (PC corrupted on return
>from the signal), the latter could.

The former has more context than the latter; the former has the 68010
"stack puke", the latter has, as I remember, just the First Part Done
bit (and some of the general-purpose registers, for some of the
long-running instructions like MOVCn).  The latter is safe and, I think,
appears in the "signal context" saved by a BSD signal, so that the
instruction can be continued from user mode without the kernel having to
tuck away one or more sets of context.

>Newer machines/architectures might handle this better,

I think most RISC machines (not entirely surprisingly) have less or no
context of that sort; I'd expect things to work OK on a SPARC-based Sun,
for example, as well as a MIPS-based machine.  In fact, what
architectures other than the 68K architectures have lots of context for
that?  I don't think the 386 or the WE32K, for example, have that
problem.
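
For anybody who wants to repeat the experiment on a given machine/OS
combination, here is a minimal sketch of a "continuable SIGSEGV" test.
It is written against the present-day POSIX interfaces (sigaction() with
SA_SIGINFO, mmap(), mprotect()), which is an assumption on my part and
not what the systems discussed in this thread offered; MAP_ANONYMOUS is
a common extension rather than something every system has.  The names
guard_page, on_segv and demo_continuable_fault are made up for
illustration.  The handler stands in for "allocate the needed storage":
it just makes the protected page writable and returns, so that the
faulting store gets restarted.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical "heap limit" page; touching it plays the role of GC trigger. */
static char *guard_page;
static size_t page_size;
static volatile sig_atomic_t fault_count;

/*
 * SIGSEGV handler.  If the fault address lies in the guard page, make the
 * page accessible and return; the kernel then restarts the faulting
 * instruction.  Any other fault terminates the process.
 * Note: POSIX does not list mprotect() as async-signal-safe; calling it
 * here is common practice but is an assumption of this sketch.
 */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;

    (void)sig;
    (void)ctx;
    if (addr >= guard_page && addr < guard_page + page_size) {
        fault_count++;
        if (mprotect(guard_page, page_size, PROT_READ | PROT_WRITE) == 0)
            return;
    }
    _exit(1);
}

/* Returns the number of faults taken (expected: 1), or -1 on failure. */
int demo_continuable_fault(void)
{
    struct sigaction sa, old;
    volatile char *p;
    void *mem;
    int ok;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    mem = mmap(NULL, page_size, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(mem != MAP_FAILED);
    guard_page = mem;

    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(mem, page_size);
        return -1;
    }

    fault_count = 0;
    p = guard_page;
    p[0] = 42;              /* faults; handler unprotects; store is retried */
    ok = (p[0] == 42);      /* the retried store must have taken effect */

    sigaction(SIGSEGV, &old, NULL);
    munmap(mem, page_size);
    return ok ? (int)fault_count : -1;
}
```

If the store can be continued from user mode, demo_continuable_fault()
returns 1.  On a machine/OS combination with the problem described
above, I'd expect the process to die or misbehave on return from the
handler instead.  SIGSEGV stays blocked while the handler runs, so the
sketch says nothing about taking a second SIGSEGV inside the handler.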