Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!leah!rpi!rpi.edu!deven
From: deven@pawl.rpi.edu (Deven Corzine)
Newsgroups: comp.sys.amiga.tech
Subject: Unix V7 functionality under (or along with) AmigaDOS?  (*LONG*)
Keywords: Unix V7 Minix AmigaDOS shells ... CATS?  Randell??  :-)
Message-ID: <DEVEN.89Mar6082649@daniel.pawl.rpi.edu>
Date: 6 Mar 89 13:26:49 GMT
References: <DEVEN.89Mar1180622@daniel.pawl.rpi.edu> <6124@cbmvax.UUCP> <DEVEN.89Mar3002118@daniel.pawl.rpi.edu> <6140@cbmvax.UUCP>
Sender: usenet@rpi.edu
Reply-To: shadow@pawl.rpi.edu
Organization: RPI Public Access Workstation Lab, Troy NY
Lines: 424
In-reply-to: jesup@cbmvax.UUCP's message of 3 Mar 89 20:43:56 GMT

In article <6140@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>In article <DEVEN.89Mar3002118@daniel.pawl.rpi.edu> shadow@pawl.rpi.edu writes:

>	I remember these problems well, I wrote a csh clone, SeaShell, before
>I came to commodore.

Hmm.  I have a shell you wrote several years ago resembling Matt
Dillon's shell.  (I got the disk from Sandro)  Is that the one?
(Hardcoded to "expire" sometime in 1986?  [setting back the clock
works great, by the way...  :-)]  I'll surely erase it soon, but the
extra utilities you had on the disk seemed to be of some possible
value...)

>>>>Also, when you run a resident command from the CLI or run, does Exit()
>>>>have a check for a resident seglist to decide whether or not to
>>>>UnLoadSeg() the seglist, or does it pass control back to the CLI or
>>>
>>>C programs should not be using Exit() which never matches the
>>>exit things your compiler startup wants you to clean up.
>>
>>1. Some programs still use Exit(); it's Amiga specific, not compiler
>>specific.
>>
>>2. Some programs are simply naughty and use it.
>
>	Yup.

I want a shell which will work correctly whether or not exit(),
_exit() or Exit() is called, and without regard to what compiler is
being used.  I want the shell to compile under Lattice or Aztec (but
I'm using Lattice) or any other compiler there may be (PDC if it
works, etc.)

>>3. The compiler exit() calls the compiler _exit() which calls Exit().
>
>	Not Lattice.
>
>>Regardless, I'm interested in the point where Exit() is called, not
>>where exit() is called.  I could use Lattice's onexit() function, but
>>I don't want to tie it to any single compiler and I don't want to use
>>the one onexit() available.  Also, I still want to catch it even if
>>Exit() is called by the program...
>
>	Lattice pops the stack back to the starting point, then does an RTS.
>The trick to making Exit() work is in the setting of pr_ReturnAddr.  Note
>this is somewhat magic.

So to have Exit() handled correctly, have the startup module (or
perhaps the shell?) look up the return address at the start of the
stack and copy it to pr_ReturnAddr?  Or must something else be done?
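
To pin down what I think "pops the stack back to the starting point"
means, here is a rough portable-C analogy.  It is NOT Amiga code:
setjmp()/longjmp() stand in for the saved return address, and every
sk_ name below is made up for illustration.  The launcher records a
return point before calling the program, and the Exit() stand-in
unwinds straight back to it from any call depth.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for pr_ReturnAddr: where "Exit()" unwinds to. */
static jmp_buf sk_return_point;
static int sk_return_point_valid = 0;
static volatile int sk_exit_code = 0;

/* Stand-in for Exit(): abandon the program's stack, return to the shell. */
void sk_Exit(int code)
{
    assert(sk_return_point_valid);   /* only legal while a program runs */
    sk_exit_code = code;
    longjmp(sk_return_point, 1);
}

/* A "naughty" program that bails out from several calls deep. */
void sk_nested(int depth)
{
    if (depth > 0) {
        sk_nested(depth - 1);
        return;
    }
    sk_Exit(5);
}

int sk_naughty_program(void)
{
    sk_nested(3);
    return 0;                        /* never reached */
}

/* A polite program that simply returns its code. */
int sk_polite_program(void)
{
    return 7;
}

/* The "shell": record the return point, run the program, and collect
   the return code whichever way the program ends. */
int sk_run(int (*program)(void))
{
    int code;

    if (setjmp(sk_return_point) != 0) {
        sk_return_point_valid = 0;   /* we got here through sk_Exit() */
        return sk_exit_code;
    }
    sk_return_point_valid = 1;
    code = program();                /* normal return (the RTS case) */
    sk_return_point_valid = 0;
    return code;
}
```

Either way the shell regains control with a return code in hand, which
is the property I'm after regardless of compiler.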

>>>The shell has the responsibility of unloading (or not unloading
>>>in the case of resident segments) the programs.
>>
>>But does this hold true when the shell is NOT running under the CLI?
>>It will be running as a DOS Process, but NOT as a CLI process.  The
>>manuals imply that Exit() frees the seglist for a DOS process but not
>>for a CLI process.  I want Exit() to leave it alone and let the shell
>>handle the seglist.  Under exactly what circumstances will Exit()
>>release memory?  (Or ANY resources...)
>
>	Shells are shells, period.  The Workbench is a visual shell.
>Exit() itself does almost nothing, except return control to the "shell"
>as if the program had exited.

So Exit() only ever releases resources if pr_ReturnAddr points to some
such cleanup routine?  What does CreateProc() normally initialize
pr_ReturnAddr to, then?  Exactly such a routine?

>>Also, where can I safely store an extra data structure for the
>>process, without conflicting with anything?  Several possibilities
>>come to mind.
>
>	Nowhere, unless (a) you're willing to use AddTask for everything,
>and therefor are willing to reimplement the functionality of CreateProc
>(fully, including endcode for cleaning up things like directory locks,
>pr_SegList pointer (which is bizarre), etc).

I'm willing to use AddTask if it will still operate as a DOS process
and be able to call dos.library routines.  How should I go about doing
this?  Also, what makes pr_SegList bizarre?  (aside from usage of
BPTRs?)

Hmm.  Looking at "The AmigaDOS manual," (AmigaDOS V1.1, I believe;
published February 1986.) it lists the Process structure starting with
BPTR SegArray, describing it as an array of SegList pointers with its
size in the first longword.  [seems consistent with BSTR's.]
"CreateProc creates this array with the first two elements of the
array pointing to resident code and the third element being the
SegList passed as argument.  When a process terminates, FreeMem is
used to return the space for the SegArray."

On the other hand, in the <libraries/dosextens.h> include file, the
Process struct is defined with pr_Task, pr_MsgPort, pr_Pad, and then
BPTR pr_SegList, not SegArray.  But the comment on the line says:
/* Array of seg lists used by this process */
Is there then no difference but the name used?
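
Here is how I read the manual's description, as a small simulation in
portable C.  Memory is modelled as an array of longwords so that a
BPTR (a byte address shifted right by two) can be shown on any
machine; the sk_ and SK_ names are my own inventions, not anything
from the include files.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t SK_BPTR;                    /* longword address */

#define SK_BADDR(b)   ((uint32_t)(b) << 2)   /* BPTR -> byte address */
#define SK_MKBADDR(a) ((uint32_t)(a) >> 2)   /* byte address -> BPTR */

#define SK_WORDS 32
static uint32_t sk_mem[SK_WORDS];            /* simulated memory */

static uint32_t sk_peek(uint32_t byte_addr)
{
    assert(byte_addr % 4 == 0 && byte_addr / 4 < SK_WORDS);
    return sk_mem[byte_addr / 4];
}

static void sk_poke(uint32_t byte_addr, uint32_t value)
{
    assert(byte_addr % 4 == 0 && byte_addr / 4 < SK_WORDS);
    sk_mem[byte_addr / 4] = value;
}

/* Lay out a SegArray as the manual describes it: the size in the
   first longword, two entries for resident code, then the SegList
   that was passed in.  Returns a BPTR to the array. */
SK_BPTR sk_make_segarray(uint32_t at, SK_BPTR res1, SK_BPTR res2,
                         SK_BPTR seglist)
{
    sk_poke(at, 3);
    sk_poke(at + 4, res1);
    sk_poke(at + 8, res2);
    sk_poke(at + 12, seglist);
    return SK_MKBADDR(at);
}

/* Fetch entry n (1-based; longword 0 holds the size).
   Returns 0 when n is out of range. */
SK_BPTR sk_segarray_entry(SK_BPTR segarray, uint32_t n)
{
    uint32_t base = SK_BADDR(segarray);
    uint32_t size = sk_peek(base);

    if (n == 0 || n > size)
        return 0;
    return sk_peek(base + 4 * n);
}
```

If pr_SegList really is this array under another name, then entry 3
is the only one a replacement for CreateProc() would own.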

>>One, put CLI process number 0 in the dos Process structure (i.e. not a
>>CLI process) and then use the CLI structure pointer field to point to
>>my structure.  Would this cause any incompatibility problems?
>
>	I'd bet yes.

I thought as much.

>>Two, make an extended structure with a process structure as its first
>>element, and add fields after it for my own use.
>
>	See above.

Hmm.  This seems the only reasonable way to go: replace CreateProc()
and use AddTask().  I take it the extra segments in the SegArray
(pr_SegList) are things like the cleanup code for CreateProc(), which
are not to be unloaded.  May I safely assume that if I replace
CreateProc() with my own function which sets up a structure starting
with a Process structure (as the Process struct begins with a Task
structure) that I may use the pr_SegList field as I see fit to manage
the memory, without affecting operation of programs run by the shell?
(i.e. does anything depend on the way CreateProc() sets up
pr_SegList?)

(running down the list...)

May pr_Pad safely be ignored?

pr_StackSize seems straightforward enough.

I don't understand what the pr_GlobVec field is used for.  Will
non-BCPL programs ever use it?  What do I need to initialize it to?

I'll set pr_TaskNum to zero.

For pr_StackBase, do I just allocate a memory area the size of
pr_StackSize, (and should I/need I allocate an extra longword to hold
the size of the allocated block, as AmigaDOS does?) and set
pr_StackBase to the last longword of the stack?  Or is it the last
byte?

Regarding pr_Result2, do I initialize it to zero and forget it, or do
I need to actually do something to get the secondary result from the
last call?  (And exactly what defines the secondary result?  The
Exit()/exit() value?)

For pr_CIS and pr_COS, should I set them to be the standard input and
output of the executed program, or set them to zero and define the
standard input, output and error channels in my own structure?  (I
want stderr as a passed file descriptor, along with stdin and
stdout...)  I want to break dependency from the CLI, yet still be able
to run programs which depend on the CLI themselves.  (possibly by
using or replacing the AmigaDOS RUN command.)

Similarly, should I zero pr_ConsoleTask and pr_FileSystemTask or not?
(I intend to have read(), write(), etc. calls...  functionally similar
to Lattice's, but not dependent on the Lattice compiler or lc.lib
library.)

pr_CLI will be set to zero.

What's the best way to set pr_ReturnAddr?  Set it to the PC plus a
constant?  JSR to a piece of code which stores the return address in
pr_ReturnAddr and JMP's to the code?  Hmm.  Silly question.  The
latter, of course.

I'll probably leave pr_PktWait as zero, at least for now.

I'm undecided on what to do with pr_WindowPtr.

My added structure would then follow.  Do you foresee any compatibility
problems caused by a setup like this?
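
A minimal sketch of the layout I have in mind, with small stand-in
structs in place of the real Task and Process definitions so that it
compiles anywhere.  The up_ fields and sk_ functions are hypothetical.
The point is only that a pointer to the embedded Process is also a
pointer to the extended structure, provided the Process comes first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins only; the real definitions are in the system includes. */
struct SkTask    { int tc_Dummy; };
struct SkProcess { struct SkTask pr_Task; long pr_StackSize; };

/* Extended structure: the Process MUST stay the first member. */
struct SkUnixProc {
    struct SkProcess up_Process;
    int    up_Stdin, up_Stdout, up_Stderr;   /* descriptor numbers */
    char **up_Envp;                          /* environment array */
};

/* Replacement for the allocation part of CreateProc(). */
struct SkProcess *sk_new_proc(long stack_size)
{
    struct SkUnixProc *up = calloc(1, sizeof *up);

    if (up == NULL)
        return NULL;
    up->up_Process.pr_StackSize = stack_size;
    up->up_Stdin  = 0;
    up->up_Stdout = 1;
    up->up_Stderr = 2;
    return &up->up_Process;
}

/* Recover the extended structure.  Only valid for processes that
   came from sk_new_proc(); anything else has no extension at all. */
struct SkUnixProc *sk_unix_proc(struct SkProcess *p)
{
    assert(p != NULL);
    assert(offsetof(struct SkUnixProc, up_Process) == 0);
    return (struct SkUnixProc *)p;
}
```

The obvious weakness is in that last comment: nothing in the Process
itself says whether the extra fields exist, so a process started some
other way would have to be told apart by other means.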

>>Three, add a segment to the end of the segment list as follows:
>>
>>BPTR addseg(seglist,segsize)
>>BPTR seglist;
>>ULONG segsize;
>>{
>>   BPTR *segment, *newsegment;
>>   APTR AllocMem();
>>
>>   /* walk to the last segment; each link is a BPTR to the next */
>>   segment=(BPTR *) BADDR(seglist);
>>   while(*segment) segment=(BPTR *) BADDR(*segment);
>>   /* room for the size longword and the link, plus the data */
>>   newsegment=(BPTR *) AllocMem(segsize+8L,MEMF_PUBLIC|MEMF_CLEAR);
>>   if(!newsegment) return(0L);
>>   *newsegment++=(BPTR) segsize;
>>   *segment=(BPTR) ((ULONG) newsegment>>2);
>>   return(*segment);
>>}
>>
>>This is somewhat kludgy, to be sure, but it seems it should work.
>
>	Programs that seglist-split may be confused by this, as well as
>BCPL programs.

There is that.  I guess that idea is out.

>>On a related point, is this an equivalent function for UnLoadSeg(), or
>>does UnLoadSeg() do something else/more?
>
>	Much more.  First there's overlaid programs.  Releasing them is
>somewhat tricky, since they have open filehandles, and tables to be freed.
>Plus UnloadSeg must work for a NULL seglist.  And then there are "resident
>libraries" (yech).

Hmm.  Ok, then.  Maybe I won't try to rewrite UnLoadSeg(), at least
not unless I also rewrite LoadSeg() (which I may.)

>Another ex-RPI student...

Yes, I know.  :-)

Hey, maybe I could join CATS sometime!  :-)  (But being on the
Internet is just so nice...)

>Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup

By the way, Randell, thanks for all your help...  (And I hope you'll
reply to this huge message, too...)

Time for a little more explanation of what I'm trying to do here.

I have programmed quite a bit under Unix and rather prefer the Unix
file system and system calls to many of those available on the Amiga.
I don't much care for AmigaDOS or BCPL.  Some of the user interface
issues between Unix and AmigaDOS are trivial, like "../" vs. "/" for
parent directory.

Other differences, such as the way directories are implemented, are
more significant.  Unix-style directories are far more efficient at
directory listing (but not at the hash-lookup searching AmigaDOS does)
than AmigaDOS with its backpointers.  Also, the lack of links in
AmigaDOS is a big loss.

I want to write a shell which will be more cleanly implemented than
the CLI (i.e. no BCPL) and which will start programs with argv AND
envp arrays, and have Unix-type calls available.
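
For anyone who hasn't used envp: it is a NULL-terminated array of
"NAME=value" strings handed to the program alongside argv.  A rough
sketch of a lookup over such an array follows; sk_getenv is a made-up
name, and this is portable C, nothing Amiga-specific.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look up NAME in a NULL-terminated array of "NAME=value" strings.
   Returns a pointer to the value, or NULL if NAME is not set. */
const char *sk_getenv(char *const envp[], const char *name)
{
    size_t len, i;

    assert(envp != NULL && name != NULL);
    len = strlen(name);
    for (i = 0; envp[i] != NULL; i++) {
        /* the name must match in full and be followed by '=' */
        if (strncmp(envp[i], name, len) == 0 && envp[i][len] == '=')
            return envp[i] + len + 1;
    }
    return NULL;
}
```

The shell would build this array once and pass it to each program it
starts, so settings are per-process rather than global assignments.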

I have the source code to Minix available, but I don't think I want to
do a direct port.  For one thing, there are some aspects of Minix I
would like to improve on, and Minix, while it would be very nice,
might not be ideal for an Amiga without a hard drive.

What I would like to do is write a file system based in part on Minix,
and in part on AmigaDOS (but *no* BCPL!) which would work more
effectively with floppy-based systems.  (i.e. Mount file systems like
in Minix/Unix, but volume-oriented, as in AmigaDOS.)

I DO rather like the low level Exec.  The ROM Kernel is quite well
designed.  I consider AmigaDOS to be rather poor, on the other hand.
It DOES contain some rather clever design features; assigned devices
are useful (though not good as a general replacement for environment
variables) and being able to mount devices is a definite plus.

Using BCPL for AmigaDOS seems a colossal error.  Exec was done in C,
but not AmigaDOS.  (I've often wondered just what TriPOS was/is like,
though...)  I understand pressures to get the OS out the door, but it
would be best to get rid of BCPL stuff totally, and the sooner the
better.  Already AmigaDOS is getting entrenched enough that it may be
too late.  But I do hope that AmigaDOS V2.0 will be written in and for
C code, with no BCPL thrown in.

Nice as it would be to rewrite the file system, I'm not quite ready to
do that just yet.  I don't have the time to spare, either.  (Sure, go
ahead and call me a hypocrite.  It's not *my* responsibility to
support the Amiga.  I have an excuse.  :-)

But what I DO want now is the fork() and exec() system calls from
Unix.  (execve() for purists)  And I'm willing to take a stab at
writing them myself.

Since fork() is the simpler of the two, let's look at it first.

I want a *real* fork(), not a fake one like the Lattice library
offers, which is similar in function to Execute().  (Well, closer to
LoadSeg() followed by CreateProc, actually.)  I want a fork() which
duplicates the current process with the only difference being the
returned value, as in Unix.

Fork() would start with a FindTask(0L) call, allocate space for a new
Task structure, copy the data over, and similarly duplicate the text
and data segments, modifying the appropriate fields to point to the
copies, and link in the new task to Exec with either AddTask()
(obviously preferable) or manually if necessary.

I'm afraid I don't have the RKM's with me, so I can't refer to the
semantics of AddTask().  <proto/exec.h> defines it as:

void AddTask(struct Task *, char *, char *);

The first argument is clearly the (initialized?) Task structure.  I
can't think of what the other two are offhand.  (I'll look it up
later, I guess.)

Anyhow, I see several points of possible difficulty.

First is whether this fork() routine would need to disable
multitasking (interrupts seem fine) at all, such as when copying the
Task structure of the current process.  Or, is it perfectly safe and
consistent (no race conditions) to let multitasking continue
unhindered while it initializes the new Task structure?  (Clearly, it
would be a preferable approach, but only if consistently safe.)

Second, how should the fork() call make execution begin at the return
point of said fork() call?  (Perhaps this question will answer itself
when I look up AddTask().)

Third, how to make fork() return different values to the two
processes?  (I suppose by jumping to a different portion of the fork()
function with the AddTask() call, so the function can return two
different values to the two tasks.)
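
For reference, this is the behaviour I want to reproduce, shown on a
real Unix (POSIX) system rather than the Amiga.  Only the function
name sk_fork_demo is made up; fork(), waitpid() and _exit() are the
standard calls.  The same code runs in both processes and the return
value of fork() is the only thing telling them apart.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the child exit with a known code, and return that code
   as seen by the parent.  Returns -1 if anything goes wrong. */
int sk_fork_demo(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;               /* fork failed; there is no child */
    if (pid == 0)
        _exit(42);               /* child: fork() returned 0 */

    /* parent: fork() returned the child's process id */
    assert(pid > 0);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On the Amiga there is no copy of the address space to fall back on,
which is exactly why the two return paths need separate handling.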

Finally, (I hope) how to handle file descriptors.  The solution for
this would seem to be to have file i/o operations in this library -
open(), close(), read(), write(), etc. and have a table of file
descriptors, file pointers, and AmigaDOS file handles (for now, at
least.) that the "parent" and "child" share.  I don't know whether or
not to try to preserve the parent/child relationship as Unix does, or
try some other setup.  I'll have to think about it.
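
One possible shape for that table, sketched in portable C.  All names
are hypothetical and the "handle" is just a long standing in for an
AmigaDOS file handle.  Each process gets its own small descriptor
array, but the entries point into a shared open-file table with a
reference count, so the underlying handle is only really closed when
the last descriptor goes away.

```c
#include <assert.h>

#define SK_MAX_FILES 8
#define SK_MAX_FDS   4

struct SkOpenFile {
    long handle;     /* stand-in for an AmigaDOS file handle */
    long pos;        /* file position, shared by parent and child */
    int  refs;       /* descriptors currently referring to this entry */
};

static struct SkOpenFile sk_files[SK_MAX_FILES];   /* shared table */

struct SkFdTable { int slot[SK_MAX_FDS]; };        /* -1 means unused */

void sk_fd_init(struct SkFdTable *t)
{
    int fd;
    for (fd = 0; fd < SK_MAX_FDS; fd++)
        t->slot[fd] = -1;
}

/* Install a handle in the lowest free descriptor; -1 if tables full. */
int sk_fd_open(struct SkFdTable *t, long handle)
{
    int fd, f;

    assert(handle != 0);
    for (fd = 0; fd < SK_MAX_FDS && t->slot[fd] >= 0; fd++)
        ;
    for (f = 0; f < SK_MAX_FILES && sk_files[f].refs > 0; f++)
        ;
    if (fd == SK_MAX_FDS || f == SK_MAX_FILES)
        return -1;
    sk_files[f].handle = handle;
    sk_files[f].pos = 0;
    sk_files[f].refs = 1;
    t->slot[fd] = f;
    return fd;
}

/* What fork() would do: the child shares every open file. */
void sk_fd_fork(const struct SkFdTable *parent, struct SkFdTable *child)
{
    int fd;
    for (fd = 0; fd < SK_MAX_FDS; fd++) {
        child->slot[fd] = parent->slot[fd];
        if (child->slot[fd] >= 0)
            sk_files[child->slot[fd]].refs++;
    }
}

/* Returns 1 if the real handle should now be closed, 0 if another
   descriptor still uses it, -1 for a bad descriptor. */
int sk_fd_close(struct SkFdTable *t, int fd)
{
    int f;

    if (fd < 0 || fd >= SK_MAX_FDS || t->slot[fd] < 0)
        return -1;
    f = t->slot[fd];
    t->slot[fd] = -1;
    return --sk_files[f].refs == 0;
}
```

Sharing the position as well as the handle is what Unix does after a
fork(); whether that is worth keeping here is part of what I still
have to think about.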

So much for the "easy" one.  Phew!  Now, the hard one.  Coding exec().

Here are the steps Minix uses for its exec() system call:

1. Check permissions - is the file executable?
2. Read the header to get the segment and total sizes.
3. Fetch the arguments and environment from the caller.
4. Release the old memory and allocate the new one.
5. Copy stack to new memory image.
6. Copy text and data segments to new memory image.
7. Check for and handle setuid, setgid bits.
8. Fix up process table entry.
9. Tell kernel that process is now runnable.

Now, some difficulties are immediately apparent.

One significant obstacle is having exec() live in a scanned
(link-time) library that is linked into the executable program.  As
such, the text image to be completely replaced by the new program will
contain the very code doing the replacing.  Clearly this poses a
problem.

If exec() were a real system call (say, another function in
exec.library), there would be no big problem, as the code to replace
the text image would be separate from that image.

I could put exec() in a run-time library, true.  But I prefer to avoid
doing so.  It's one more run-time library you must remember to have
available to use the program.  It is quite annoying to try to use ARP
commands and get an error because you don't have arp.library (V34+) in
LIBS:.  So, unless the functions prove to be prohibitively large, I
prefer to have them as compile-time libraries, to simplify things for
the end user.

So, there is the problem of separating the exec() call from the text
image.  What would seem to be the most usable method would be to have
exec() start by duplicating itself (sans copying code) into a newly
allocated single-segment seglist, which is passed to CreateProc() with
a high priority and a minimal stack.  This is somewhat of a kludge,
but should be workable, as CreateProc() has the cleanup code for the
exec() function itself, once it has done its work and is no longer
needed.

If the setup were later modified to work as a run-time library, the
compile-time library could be easily changed to accommodate the
modification, with no program changes necessary, and recompilation
optional.

Actually, the checks for the exec() would be performed first,
including allocating memory for the new text and data segments, and
only once it is ready to replace the current process would it create
the seglist of the function to finish the job and CreateProc() it.
The argument to that part of the function would be the process
structure for the process to replace.  (I suppose this would actually
be put either in a second segment in the seglist, or near the start or
at the end of the segment with the code.  A separate [data] segment
seems best, no?)

If any of the checks, allocations, or the CreateProc were to fail,
then the exec() call would deallocate any allocated memory, and return
with a -1 and an error number, probably in the (global) variable errno.
Otherwise, it would simply wait forever, (pick some event to wait for
which will never happen) and the newly created process would RemTask
the original task, handle the switchover and deallocation, and start
the new process up.  (Then it would return, falling into CreateProc's
cleanup code.)
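
The ordering is the important part, so here is a rough sketch of it in
portable C.  Everything is a stand-in: SkFile plays the program file,
SkImage plays the process's text image, and sk_exec is hypothetical.
A real exec() would not return on success; this one returns 0 so the
sketch can be exercised.  All steps that can fail happen before the
old image is touched.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct SkFile  { int executable; const char *body; };  /* fake file */
struct SkImage { char *text; size_t text_size; };      /* fake image */

/* Replace the image in *current with the contents of *file.
   Returns 0 on success, or -1 with errno set and *current intact. */
int sk_exec(struct SkImage *current, const struct SkFile *file)
{
    char  *new_text;
    size_t size;

    assert(current != NULL);
    if (file == NULL || file->body == NULL) {
        errno = ENOENT;
        return -1;
    }
    if (!file->executable) {            /* check permissions */
        errno = EACCES;
        return -1;
    }
    size = strlen(file->body) + 1;      /* "read the header" for sizes */
    new_text = malloc(size);            /* allocate new BEFORE freeing */
    if (new_text == NULL) {
        errno = ENOMEM;
        return -1;
    }
    memcpy(new_text, file->body, size); /* copy text to the new image */

    /* Point of no return: nothing below this line can fail. */
    free(current->text);                /* release the old memory */
    current->text = new_text;
    current->text_size = size;
    return 0;
}
```

Note this differs from the Minix order, which releases the old memory
before allocating the new; with no way back after that, I would
rather hold both for a moment and keep the failure path clean.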

How to handle resource deallocation of the old process presents yet
another difficulty.  (oh, but of course!)

The file descriptors would be preserved, but memory and resources
allocated by the prior process need to be released.  One solution (the
cheap and easy way out) is to free malloc() allocated memory (if I
write a malloc(), etc. set of routines) and leave other resources
allocated, (such as memory gained from AllocMem) to allow resources to
be passed to the called program.

Alternatively, most allocation types could be duplicated in the
library, with tracking versions, for automatic deallocation upon exit
or exec().  (Oboy, now I AM digging myself a grave here, aren't I?)
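
A tracking malloc() is at least simple.  Here is a minimal sketch in
portable C, with made-up sk_ names: each block carries a hidden header
linking it into a list, so everything still outstanding can be freed
in one sweep at exit or exec() time.  A real version would need one
list per process rather than the single global used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hidden header in front of every block.  The extra members exist
   only to keep the user data suitably aligned. */
union SkHeader {
    union SkHeader *next;
    long double     align_ld;
    long            align_l;
};

static union SkHeader *sk_blocks = NULL;   /* outstanding blocks */

void *sk_malloc(size_t size)
{
    union SkHeader *h;

    if (size > (size_t)-1 - sizeof *h)     /* guard against overflow */
        return NULL;
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->next = sk_blocks;                   /* push onto the list */
    sk_blocks = h;
    return h + 1;                          /* user data follows header */
}

void sk_free(void *p)
{
    union SkHeader *h, **link;

    if (p == NULL)
        return;
    h = (union SkHeader *)p - 1;
    for (link = &sk_blocks; *link != NULL; link = &(*link)->next) {
        if (*link == h) {                  /* unlink, then release */
            *link = h->next;
            free(h);
            return;
        }
    }
    assert(!"sk_free: block did not come from sk_malloc");
}

/* Free everything still outstanding; returns how many blocks. */
int sk_free_all(void)
{
    int n = 0;

    while (sk_blocks != NULL) {
        union SkHeader *h = sk_blocks;
        sk_blocks = h->next;
        free(h);
        n++;
    }
    return n;
}
```

The same wrapping trick would apply to AllocMem(), Open(), Lock() and
friends, which is where the grave-digging starts.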

Hmm.  I see potential problems with startup and cleanup code as well.
*sigh*  I guess I really would be better off making it a resident
library.  In that case, it would need to be set up by an
initialization program ("Unix" bootstrap program) which would install
the library in ram, start an "init" process, and then hang around as
the "system task" to coordinate everything.

And maybe add the file system and memory management tasks Minix has as
well.  I suppose adding simplistic ones to start with would be ok.
(Simplistic meaning they just process the requests by calling existing
Exec and AmigaDOS library routines, for now.)

Well, my mind is getting befuddled now, and I'm tired of typing and
this message is already too damn long.  So, I'll cut it off here.

Someone... anyone...  please reply.  I wouldn't want this to be a
wasted effort.  (And more information/ideas are always helpful.)

Enough for now.

Deven
--
------- shadow@pawl.rpi.edu ------- Deven Thomas Corzine ---------------------
Cogito  shadow@acm.rpi.edu          2346 15th Street            Pi-Rho America
ergo    userfxb6@rpitsmts.bitnet    Troy, NY 12180-2306         (518) 272-5847
sum...     In the immortal words of Socrates:  "I drank what?"     ...I think.