Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site peora.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!pyramid!pesnta!peora!jer
From: jer@peora.UUCP (J. Eric Roskos)
Newsgroups: net.mail
Subject: Re: Pathalias/uumail: some algorithms and questions
Message-ID: <1954@peora.UUCP>
Date: Mon, 3-Feb-86 10:12:13 EST
Article-I.D.: peora.1954
Posted: Mon Feb  3 10:12:13 1986
Date-Received: Wed, 5-Feb-86 01:34:26 EST
References: <122@delftcc.UUCP>
Organization: Concurrent Computer Corporation, Orlando, Fl
Lines: 96

> I think the algorithms that underlie a pathalias-routing program such as
> Stan Barber's uumail are worth discussing, independent of their
> implementation.  What we need is a sort of "Guide to Using the Pathalias
> Database".

First of all, by way of clarification, I wrote the "opath" routing code in
Stan's uumail (though I am very grateful to him for including it in his
program; he also cleaned up a number of things).  Actually I had a lot of
reasons for the approaches used; let me comment in response to your
observations on why I did various things (most of all, leaving things
alone sometimes).

> Okay, first question.  Given a path
>
>         a!b!c!...!p!q!r!stuff@z
>
> which name(s) do we try to look up?  Basically, either (1) "a", or (2)
> "r", or (3) "z".  We can refine these choices a bit: (1) if we don't
> find "a" in the database, we try "b", then "c", and so on; likewise, (2)
> if we don't find "r", we try "q", then "p", and so on.  Also, (3) if we
> don't find "z", graduate to option (1) or (2).  "uumail" lacks these
> refinements, I think, and it also has no option (2).

My algorithm was to look up "a", iff "a" is not a neighbor. (In the
original version of the program, I had a table of neighbors so that the
database didn't have to be referenced in that case; but since the path for
neighbors is just the name of the neighbor, and since obtaining the table
in a secure manner required either hardcoding the table or calling a
setuid program, I eventually eliminated that and just looked up all the
names.) "z" definitely should not be looked up; that is the essence of
what I have been arguing for a long time now, viz., that the string
representing the path should consist of "names interpreted at the next
site" separated by "!"s, with no character other than "!" significant at
that level of parsing (though a given "name interpreted at the next
site" can be parsed further by the site it was intended for).
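
A minimal sketch of the rule above, for illustration only (this is not
the opath code).  It assumes a table that maps a site name to a route
template containing "%s", in the style of pathalias output; the table
contents and the name route() are made up.

```python
# Hypothetical routing table: site name -> route template with "%s".
PATHS = {
    "decvax": "decvax!%s",           # a neighbor: the path is just its name
    "seismo": "decvax!seismo!%s",    # reached through a neighbor
}

def route(address, paths=PATHS):
    """Look up only the first name of a bang path; pass the rest through."""
    first, sep, rest = address.partition("!")
    if not sep:
        # No "!" at all: nothing to route at this level.
        return address
    template = paths.get(first)
    if template is None:
        raise KeyError("no route to " + first)
    # Everything after the first "!" is opaque here, including any "@".
    return template.replace("%s", rest, 1)
```

Here route("seismo!b!c!stuff@z") gives "decvax!seismo!b!c!stuff@z"; the
"@z" is carried along untouched for a later site to interpret.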

I have mixed feelings about looking up "r".  I definitely don't think it
should be looked up if there is any alternative; a lot of AT&T mailer
sites do that, and it causes a lot of trouble from time to time when you
want to explicitly specify a path, for whatever reason, and then a mailer
down the line "optimizes" it.  I do think it is somewhat more reasonable to
rewrite the path if you're a site such as the gateways in Europe which
have to choose between low-cost packet networks and high-cost conventional
telephone connections, though.

The algorithm of trying successive sites down the path if the next one
is unknown is an interesting improvement, though.  The problem is, since
site names are not unique, if you don't know some of the names that make
up the context of one you do know, you may end up choosing the wrong one.
Peter Honeyman's new "pathparse" program might be better to use for this.
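
The fallback and its hazard can be sketched as follows.  This is only an
illustration under the same assumed table format (site name -> template
with "%s"); route_with_fallback() is a made-up name.  It returns the
names it had to skip, since those are exactly the context that would
have told you which site of a given name was meant.

```python
def route_with_fallback(address, paths):
    """Try each site name in turn; return (route, skipped_names).

    route is None if no site in the path is known.
    """
    hops = address.split("!")
    # The last component is the name handed to the final site,
    # so it is never looked up as a site itself.
    for i, host in enumerate(hops[:-1]):
        template = paths.get(host)
        if template is not None:
            rest = "!".join(hops[i + 1:])
            return template.replace("%s", rest, 1), hops[:i]
    return None, hops[:-1]
```

A non-empty list of skipped names is a warning: the site that matched
was found without the context that preceded it, so it may be a
different site that happens to share the name.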

> Second question.  Give an algorithm for mapping a name into a path.

This is well-defined in RFC822.  The current "opath" routines let you
"cheat" somewhat on this, since you have to explicitly specify what
".GIZMO" means as distinct from ".GIZMO.UUCP".

>   (3) Prepend a "." to the name, and look it up.  This is to cover
>       things like "larry.rosler@ATT.UUCP" or "sob@harvard.edu";
>       "ATT.UUCP" and "harvard.edu" are domains, not actual hosts, but
>       it can still make sense to send things to them.  (I'm not sure
>       that these addresses actually work.)

First of all, if you look in the latest distribution of the UUCP map,
you'll find that the map folks have already started implementing the
domains (including, alas, geographic subdomains); they're just commented
out.  For UUCP routing, in my opinion, a domain name (e.g., "ATT.UUCP")
does map to one or more site names; actually the opath code (I think
in the version Stan used) lets you choose from among a number of
alternative sites when resolving the domain name, with a weighting, so
that you can have several different nameservers for a domain, and you can
route to more than one of them, with the frequency of routing weighted
in proportion to how much you want to send to each of them.
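
One way to picture the weighted choice, as a sketch only: the table
layout, the gateway names "gatea" and "gateb", and pick_gateway() are
all assumptions made for illustration, not the format opath uses.

```python
import random

# Hypothetical domain table: domain -> list of (gateway site, weight).
GATEWAYS = {
    ".ATT": [("gatea", 3), ("gateb", 1)],   # about 3 of every 4 via gatea
}

def pick_gateway(domain, table=GATEWAYS, rng=random):
    """Pick one gateway for the domain, in proportion to its weight."""
    entries = table[domain]
    sites = [site for site, _ in entries]
    weights = [weight for _, weight in entries]
    return rng.choices(sites, weights=weights, k=1)[0]
```

Over many messages the traffic divides roughly in the ratio of the
weights, while any single message goes to exactly one gateway.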

> This assumes domain names have an initial dot, as in the latest version
> of pathalias.  I don't know if this makes "uumail"'s domain table
> obsolete; the domain table is more flexible than domains handled through
> pathalias, but it would be much more elegant and convenient to handle
> domains entirely through pathalias.  Certainly pathalias's domain
> handling (when combined with the algorithm above) is sufficient for my
> needs, but my site is UUCP-only.

I've been thinking about this a lot the past few days.  For the present,
you can use ".ATT", etc., in place of the gateway names in the ">gateway"
field of the routing table.  I haven't decided yet what the relative
merits of the two approaches (other than the probabilistic routing) are.

Well, I could go on at length, but our system is going down for maintenance,
so I'll leave it at that for now...
-- 
UUCP: Ofc:  jer@peora.UUCP  Home: jer@jerpc.CCUR.UUCP  CCUR DNS: peora, pesnta
  US Mail:  MS 795; CONCURRENT Computer Corp. SDC; (A Perkin-Elmer Company)
	    2486 Sand Lake Road, Orlando, FL 32809-7642     xxxxx4xxx

	"There are other places that are also the world's end ...
	 But this is the nearest ... here and in England." -TSE