Path: utzoo!mnetor!tmsoft!torsqnt!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!ucbvax!ICAEN.UIOWA.EDU!dbfunk
From: dbfunk@ICAEN.UIOWA.EDU (David B Funk)
Newsgroups: comp.sys.apollo
Subject: Re: Diskless boot control
Message-ID: <9102140651.AA02565@icaen.uiowa.edu>
Date: 14 Feb 91 05:42:31 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: Iowa Computer Aided Engineering Network, University of Iowa
Lines: 120

In posting <9102131559.AA02437@apo.esiee.fr>, bonnetf@apo.esiee.fr (bonnet-franck) says:

>> Here we have about 100 nodes and only 40 disked nodes,
>> very often some students reboot the diskless nodes anywhere
>> on the ring with the "DI N XXXX" command. 
>> 
  [stuff deleted]
>> 
>> In a word we want to CONTROL on which disked node a diskless node MUST boot.


In posting <9102132043.AA17329@richter.mit.edu>, krowitz@richter.mit.edu (David Krowitz)
replies with a good description of diskless node booting and ends with:

> Now as to what you can do under SR10.x ... one thing comes to mind ...
> When "netman" services a boot request it executes /sys/net/netman.rc,
> which is a link to either "netman.bin_sh" or "netman.com_sh". These
> shell scripts set up the /sys/node_data.NODE_ID directory for the
> diskless node. One of the arguments to the shell script is the node ID
> of the diskless node. You could edit the shell script to explicitly
> check the node ID before continuing on to create the diskless partner's
> directory. By refusing to create the `node_data directory, you could
> abort the attempt to boot the diskless node.

David has an excellent idea that is easily implemented. Here is a simple addition
to the "netman.com_sh" shell script that provides the suggested control.
The first argument passed to the shell script is the node ID. By comparing this
with a list of authorized nodes, it is possible to control the boot process.
If the shell script returns with an "error" status, netman aborts the boot
process and the diskless node fails in its boot attempt.
My approach is to check the node ID against the contents of the file
"/sys/net/diskless_list" and, if the node ID is not found, return with "error".
Thus the script succeeds for the nodes that it should support (the ones explicitly
listed in "diskless_list") and fails for those 'intruders' who are trying to force
a partner. Here are the first few lines of my "/sys/net/netman.com_sh" script:

  #!/com/sh
  eon
  #
  #  NETMAN.RC - shell script run by netman to setup `node_data for a diskless node
  #
  #  ^1 = NODE_ID
  #  ^2 = TYPE
  #
  # First check to see if the diskless node is one that we've been authorized
  # to provide boot service for. IE it's in our "diskless_list" file. (dbf 2/13/91)
  #
  if ( /com/fpat -i "%^1" "% *^1" < /sys/net/diskless_list > /dev/null )
    then #OK, this guy's in our diskless_list
      args "Doing boot for node: ^1"
    else #choke, we don't know this one
      args "Invalid boot request by node: ^1"
      return -e
  endif
  #
  #  The following remote paging file size must agree with what the
  #  remote node actually maps (in /os/ker/ast.pas: ast_$activate)!!!! 

Note that the effects of this "error" return on the diskless node are a bit
alarming if unexpected. The diskless node will start its normal boot procedure:
it will load the kernel, display its kernel revision number & date, and then give
a crash message with a "F0001" crash status code. This is how netman aborts the
diskless node boot process; it is NOT a system failure. Just be aware
that you may get a few panic calls from users the first time they get
caught by this. This will not prevent the diskless nodes from running other "SAU"
tools (such as "calendar" & "self_test"); it only controls OS
booting. Obviously the /sys/net directory and the "diskless_list" & "netman.com_sh"
files will need to be protected from world write access to complete the
picture. Whenever netman invokes this shell script, its standard output is
directed into the file "/sys/node_data/systmp/netman.out". This file can be used
for debugging modifications to the script or for checking its actions.
  For this script to be used, the link "/sys/net/netman.rc" must resolve to
"netman.com_sh" and you must have Aegis ("/com") loaded on your system.
If you are a Unix-only shop, you will need to make an equivalent modification
to the script "netman.bin_sh".
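
For the Unix-only case, here is a rough Bourne shell sketch of the same check.
I have not run it inside a real "netman.bin_sh", so treat it as a starting
point only. It assumes netman passes the node ID as $1 to "netman.bin_sh" just
as it does to "netman.com_sh", and that a non-zero exit status aborts the boot
the same way the "return -e" does. The function name "node_allowed" and the
DISKLESS_LIST variable are my own inventions, not part of the Apollo script.
Note that it compares the whole first word of each line, which is a little
stricter than the "fpat" pattern above.

```shell
# Hypothetical helper, not part of the Apollo-supplied script.
# Default list location is the one used by the /com/sh version.
DISKLESS_LIST=${DISKLESS_LIST:-/sys/net/diskless_list}

# node_allowed NODE_ID [LIST_FILE]
# Succeeds (status 0) if NODE_ID equals the first word of some line in
# the list file, ignoring case and leading blanks. Fails otherwise,
# including when the node ID is empty or the list cannot be read.
node_allowed() {
    node=$1
    list=${2:-$DISKLESS_LIST}
    [ -n "$node" ] || return 1
    [ -r "$list" ] || return 1
    awk -v id="$node" '
        BEGIN { id = tolower(id); rc = 1 }
        tolower($1) == id { rc = 0; exit }
        END { exit rc }
    ' "$list"
}

# Assumed usage near the top of netman.bin_sh (left commented out here):
#
#   if node_allowed "$1"; then
#       echo "Doing boot for node: $1"
#   else
#       echo "Invalid boot request by node: $1"
#       exit 1
#   fi
```

With this version, anything after the node ID on a line of "diskless_list" is
ignored, so the rest of the line can hold a short note about the node.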

  An additional modification to this script can also close a security
loophole that was pointed out in a previous posting by Franck: when a diskless
node boots, its `node_data/etc directory is created with open ACLs, allowing
the world to modify its contents at will, including the "rc" scripts which are run
as "root" at boot time. There is a simple cure for this problem too.
First create a template directory tree "/etc/node_data" which contains
directories like "etc", "dev", "cron", and such. Then set the ACLs on these
directories as you want them to look on a properly running system. For
example, the ACL on "/etc/node_data/etc" might be:

  $ acl /etc/node_data/etc
  Acl for /etc/node_data/etc:
  Required entries
   root.%.%                         prwx-
   %.staff.%                        -r-x-
   %.%.none                         [ignored]
   %.%.%                            -r-xk
   Extended entry rights mask:      -----
(Note the "k" bit for the world)

Note that the ACLs on this template MUST be set up correctly for a properly
running system. In particular, "node_data" must be world writable or lots of
things will break. The template should contain every directory whose ACLs you
want to control.
Now modify the "netman.com_sh" script and add the "/com/cpt" line shown at the
end of this excerpt, just after the existing code that creates the directory:

  #  {----------- Create NODE_DATA directory and setup acls.------------}
  #
  DIR := "/sys/node_data.^1"
  if existf ^DIR then
     /etc/ulkob ^DIR -f
  else
     /com/crd ^DIR -open
  endif
  #
  # Add copy of our ACL template directory to the new target (dbf 2/13/91)
  /com/cpt /etc/node_data ^DIR -md -sacl


Because of the line "/com/cpt /etc/templates ^DIR/etc -md -sacl" that IS in
the standard Apollo supplied netman.com_sh script, you will also need to make
sure that /etc/templates has the same ACL as /etc/node_data/etc.

Dave Funk