Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!amdcad!ames!pasteur!ucbvax!hogg.cc.uoregon.EDU!jqj
From: jqj@hogg.cc.uoregon.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re:  Time synchronization and distribution plan
Message-ID: <8801241627.AA04287@hogg.cc.uoregon.edu>
Date: 24 Jan 88 16:27:41 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 34

As a matter of principle, I don't think it is appropriate to design a
time synchronization system for long-term use as strictly hierarchical.
That makes it too susceptible to Byzantine failures on the part of
nodes high in the hierarchy, which can cause problems for very large
subtrees.  It would make the time synchronization system particularly
inappropriate in a military/tactical environment, for example.
Although one may not like the specific algorithms, I prefer the Cornell
(Schneider, Toueg, etc.) approach, which attempts to achieve consensus
among a set of peers at any level.
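
To make the peer-consensus idea concrete, here is a minimal sketch of a
fault-tolerant average over clock readings collected from a set of peers.
It is not one of the specific Cornell algorithms; it is the generic
"discard the extremes, average the rest" technique, and the function name,
the parameter f (maximum number of faulty peers), and the n >= 3f+1
requirement are assumptions made for illustration.

```python
from statistics import mean


def fault_tolerant_average(readings, f):
    """Estimate the time from peer clock readings, tolerating up to f
    arbitrarily wrong (Byzantine) values.

    Assumption: at most f of the readings are faulty and there are at
    least 3f + 1 readings in total.
    """
    n = len(readings)
    if n < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    ordered = sorted(readings)
    # Drop the f lowest and f highest values. Every surviving value then
    # lies within the range spanned by the correct clocks.
    trimmed = ordered[f:n - f]
    return mean(trimmed)


if __name__ == "__main__":
    # Five peers, one of which reports a wildly wrong time.
    peers = [100.0, 100.2, 99.9, 100.1, 5000.0]
    print(fault_tolerant_average(peers, f=1))
```

With 5 to 10 members per set, this kind of scheme would tolerate 1 to 3
faulty peers per round, under the assumptions stated above.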

For practical purposes, it is probably acceptable to model the system
as a hierarchy of SETS of time servers, each set having 5 to 10
members.  Presumably, algorithms can be chosen to ensure that the
probability of Byzantine failure of the whole SET is acceptably low.
However, this implies that we should design a system in which the
core/primary timeservers expect to be queried not by a large number of
mutually independent secondary servers, but by a large number of
members of sets of secondaries.  For example, we might have a set of
secondaries on a given regional network all of whom attempt to achieve
consensus among themselves but who also all query the primaries as a
time reference.  Note that this also implies that any given secondary
must plan to query several primaries (to detect Byzantine failures in
the primaries).  Correspondingly, it implies more network traffic
unless we are careful in the placement of servers.
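
As a rough way to reason about "acceptably low", the sketch below computes
the probability that a set of n servers has more faulty members than a
3f+1 style algorithm can tolerate. It assumes each server fails
independently with the same probability p, which is a simplification
(servers sharing a network or a radio timebase will not fail
independently); the function name and parameters are hypothetical.

```python
from math import comb


def set_failure_probability(n, p):
    """Probability that more than f = (n - 1) // 3 of n servers are faulty.

    Assumption: each server is faulty independently with probability p.
    """
    f = (n - 1) // 3  # largest fault count a 3f + 1 algorithm tolerates
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(f + 1, n + 1)
    )


if __name__ == "__main__":
    # Compare set sizes in the 5-to-10 range for an illustrative p.
    for n in (4, 5, 7, 10):
        print(n, set_failure_probability(n, 0.01))
```

Under these assumptions, growing the set from 4 to 7 members lowers the
chance of losing the whole set, at the price of more query traffic.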

I think this suggests at least a 3-level rather than 2-level hierarchy
of time servers, where level 3 is generally individual networks or small
groups of such networks, and level 2 is large (well-connected) subsets of
the whole Internet.

Comments?

P.S. I would also like to see more thought given to how we should cope
with situations in which the radio timebases are inaccurate or inconsistent.