Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!rutgers!im4u!ut-sally!ut-ngp!infotel!pollux!bobkat!m5d
From: m5d@bobkat.UUCP
Newsgroups: comp.os.minix
Subject: Re: Minix and compiler models
Message-ID: <500@bobkat.UUCP>
Date: Fri, 30-Jan-87 12:23:27 EST
Article-I.D.: bobkat.500
Posted: Fri Jan 30 12:23:27 1987
Date-Received: Tue, 3-Feb-87 02:22:55 EST
References: <966@ulowell.cs.ulowell.edu> <1565@cit-vax.Caltech.Edu> <2289@orca.TEK.COM>
Reply-To: m5d@bobkat.UUCP (Mike McNally (dlsh))
Organization: Digital Lynx, Inc; Dallas, TX
Lines: 106

In article <2289@orca.TEK.COM> paulsc@orca.UUCP (Paul Scherf) writes:
>In article <1565@cit-vax.Caltech.Edu> jon@oddhack.UUCP (Jon Leech) writes:
>>	It's not clear to me from what I've read so far if minix supports
>>a long-model compiler. I guess this is really a question about ACK, but in
>>any case, I'd appreciate hearing from someone in the know. 
>
>I went to the MINIX BOF at USENIX last week. Andy T. said the
>compiler supports only the small model, but MINIX supports 64K
>text (code segment), 64K data (data segment) and 64K stack
>(stack segment) (like "split I and D" on some PDP-11s).
> ...
>Paul Scherf, Tektronix, Box 1000, MS 61-028, Wilsonville, OR, USA
>tektronix!orca!paulsc

I have BIG BIG problems believing that Minix allows separate code,
data, and stack segments.  This is NOT small model, at least not as
defined by Intel.  In fact, it's no model at all:

	Small:
		one code seg, one data seg (incl. data, stack, and constants)
	Compact:
		code, data, stack, and "memory"
	Medium:
		one code seg per module, only one each data, stack, "memory"
	Large:
		one code seg & one data seg per module, one stack and
		one "memory"

The "memory" segment is available from some Intel languages.  I think
there's another model in which everything is in one segment; this is
usually used for 8085 compatibility.  It could be that PC-DOS calls
this 8085 model "small".  There are also clever ways of having
"compact" sub-sections in a "large" system; see relevant Intel
literature (the PLM-86 guide is the most informative).

In all the models except "small" (or 8085 model, which is almost the
same for the purposes of this discussion), pointers MUST be 32 bits.
Why?  Because otherwise the location referenced by the pointer cannot
be determined.  Consider the C statement

    x = *p;

where "p" is a 16 bit pointer.  Which segment register should be used
to form the address?  Well, in "small" model we only have two choices,
so this MUST be a reference within the data segment, either to static
data, something on the stack, or a constant.  Note that if "p"
actually was initialized to the address of a function, the statement
above WOULD NOT retrieve the first word of the function.  On the other
hand, in this statement

    (*p)(x);

it is clear to the compiler that "p" is being used to point to code, and
so it will generate an indirect CALL.  In fact, because of the way the 
iAPX-86 uses the segment registers, the data-manipulation references
will use DS (or SS) without the compiler having to do a thing; likewise, 
the indirect CALL will fetch the value of the pointer, then form the
code address with CS.
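
To make the segment arithmetic concrete, here is a minimal C sketch of
how the 8086 forms a 20 bit address from a segment register and a 16
bit offset.  The function name phys_addr is made up for illustration;
it is not taken from Minix or from any compiler's runtime.

```c
#include <stdint.h>

/*
 * 8086 address formation: the 16 bit segment value is shifted left
 * four bits and added to the 16 bit offset.  The result is truncated
 * to 20 bits, as on an 8086 with a 1 MB address space.
 * (phys_addr is a hypothetical helper, for illustration only.)
 */
uint32_t phys_addr(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}
```

With DS = 0x2000 and CS = 0x1000, the same 16 bit value 0x0100 names
location 0x20100 when used in "x = *p" and location 0x10100 when used
in "(*p)(x)".  The pointer itself carries no segment information.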

Why do I think it's important that Minix use 16 bit pointers?  Because
it seems to me that having hard pointers (i.e. 32 bit, non-relocatable)
makes a correct implementation of fork() impossible.  Consider
this:

    ...
    char *p;

    p = func_which_returns_a_pointer(...);

    if (fork()) 
        *p = 'a';
    else
        *p = 'b';
    
During the call to fork(), the operating system must copy the data of
the parent process for the child (the code need not be copied).  If
pointers are 16 bits long, this works fine; the child process refers to
its own private copy of the space pointed to by "p".  The code works
properly because when "p" is dereferenced, the current value of DS is
consulted to form the final 20 bit address, and the two processes
have different values in DS.  BUT, if pointers are 32 bits, then a
segment base address exists in the value of "p".  There is no possible
way that the operating system can find this absolute address and change
it during the fork().  The new process will have a different DS, but
the reference through "p" will in both processes refer to the exact
same memory location.
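
The difference can be simulated in a few lines of C.  This is only a
sketch under stated assumptions: memory is a flat 1 MB array, "fork"
is nothing more than a copy of the parent's data segment to a new
segment base, and every name here (mem, far_ptr, sim_fork_copy,
store_near, store_far, load) is invented for the example.  It says
nothing about how Minix actually implements fork().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 0x100000u          /* simulated 1 MB address space */
static uint8_t mem[MEM_SIZE];

/* A 32 bit "hard" pointer: segment base and offset stored together. */
struct far_ptr { uint16_t seg, off; };

/* 20 bit address from segment:offset, as on the 8086. */
uint32_t phys(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}

/* Stand-in for the data copy done by fork(): copy len bytes of the
 * parent's data segment to the child's segment. */
void sim_fork_copy(uint16_t parent_ds, uint16_t child_ds, uint32_t len)
{
    assert(len <= 0x10000u);
    memcpy(&mem[phys(child_ds, 0)], &mem[phys(parent_ds, 0)], len);
}

/* 16 bit pointer: the segment comes from the running process's DS. */
void store_near(uint16_t ds, uint16_t p, uint8_t v)
{
    mem[phys(ds, p)] = v;
}

/* 32 bit pointer: the segment is baked into the pointer value. */
void store_far(struct far_ptr p, uint8_t v)
{
    mem[phys(p.seg, p.off)] = v;
}

uint8_t load(uint16_t seg, uint16_t off)
{
    return mem[phys(seg, off)];
}
```

If the parent runs with DS = 0x1000 and the child with DS = 0x3000,
store_near() through the same 16 bit "p" lands in each process's own
copy.  A far_ptr created before the copy still holds segment 0x1000
in the child, so both store_far() calls write to the parent's memory.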

The same problem with absolute pointers arises if process swapping is
considered: a process swapped back in at a different address still
holds the old segment values in its pointers.

There is a possible solution to this.  If the compiler generates pseudo
code which is interpreted, then all the problems disappear.  The
interpreter can provide any view of memory it wants.  Alternatively,
the compiler could generate calls to library functions, or blocks of
code in-line, for each pointer reference.  Of course, there is cost in
the form of lower performance.  In the case of Minix on a 680?0 without
memory management hardware, I see a similar problem.
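
As a rough idea of what the library-function approach might look
like, here is a sketch in C.  It assumes pointers are stored as
offsets from the start of the owning process's data area and that
every dereference goes through a helper which adds the current base.
The names (struct proc, rel_ptr, rel_deref, proc_fork) and the 1024
byte data size are assumptions made for the example, not anything
ACK or Minix is known to do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DATA_SIZE 1024u             /* arbitrary size for the sketch */

/* One process's private data area (hypothetical layout). */
struct proc {
    uint8_t data[DATA_SIZE];
};

/* A relocatable pointer: an offset into the owning process's data,
 * never an absolute address. */
typedef uint32_t rel_ptr;

/* The helper the compiler would call (or expand in-line) for every
 * pointer reference: rebase the offset against the current process. */
uint8_t *rel_deref(struct proc *cur, rel_ptr p)
{
    assert(p < DATA_SIZE);
    return &cur->data[p];
}

/* Stand-in for fork(): copy the parent's data; no pointer fix-ups
 * are needed because no stored pointer holds an absolute address. */
void proc_fork(struct proc *child, const struct proc *parent)
{
    memcpy(child->data, parent->data, DATA_SIZE);
}
```

The extra call (or in-line add) on every reference is where the
performance cost comes from.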

If anyone has some ideas on another way this could be solved, I'd be
interested.

-- 
		   |The first 15 minutes takes a long time;|
           +---the next 15 minutes takes forever---+
[[Mike McNally, a computer guy at Dallas's own Digital Lynx Inc.]]
[[uucp: {texsun,killer,infotel}!pollux!bobkat!m5d (214) 238-7474]]