Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!amdcad!ames!pasteur!ucbvax!hogg.cc.uoregon.EDU!jqj
From: jqj@hogg.cc.uoregon.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Time synchronization and distribution plan
Message-ID: <8801241627.AA04287@hogg.cc.uoregon.edu>
Date: 24 Jan 88 16:27:41 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 34

As a matter of principle, I don't think it is appropriate to design a
time synchronization system for long-term use as strictly hierarchical.
That makes it too susceptible to Byzantine failures on the part of nodes
high in the hierarchy, which cause problems for very large subtrees. It
would make the time synchronization system particularly inappropriate in
a military/tactical environment, for example. Although one may not like
the specific algorithms, I prefer the Cornell (Schneider, Toueg, etc.)
approach, which attempts to achieve consensus among a set of peers at any
level.

For practical purposes, it is probably acceptable to model the system as
a hierarchy of SETS of time servers, each set having 5 to 10 members.
Presumably, algorithms can be chosen to ensure that the probability of
Byzantine failure of the whole SET is acceptably low.

However, this implies that we should design a system in which the
core/primary timeservers expect to be queried not by a large number of
mutually independent secondary servers, but by a large number of members
of sets of secondaries. For example, we might have a set of secondaries
on a given regional network, all of whom attempt to achieve consensus
among themselves but who also all query the primaries as a time
reference.

Note that it also implies that any given secondary must plan to query
several primaries (to detect Byzantine failures in the primaries).
Correspondingly, it implies more network traffic unless we are careful in
the placement of servers.
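To make the "query several primaries" point concrete, here is a minimal sketch of how a secondary might combine the offsets it measures against several primaries so that a few wildly wrong ones cannot drag its clock. It uses a fault-tolerant midpoint (discard the f lowest and f highest readings, then take the midpoint of what remains), which is one well-known technique and not necessarily the specific Cornell algorithm. The function names, the sample offsets, and the n >= 3f + 1 bound for unauthenticated messages are assumptions for illustration only.

```python
from typing import Sequence


def max_tolerated_faults(n: int) -> int:
    """Largest f satisfying n >= 3f + 1 (assumed bound, unauthenticated case)."""
    return (n - 1) // 3


def fault_tolerant_midpoint(offsets: Sequence[float], f: int) -> float:
    """Combine clock offsets while tolerating up to f arbitrarily wrong ones.

    Sorts the readings, drops the f smallest and f largest, and returns the
    midpoint of the surviving range. Hypothetical helper, not from the post.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if len(offsets) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    ordered = sorted(offsets)
    trimmed = ordered[f:len(ordered) - f]
    return (trimmed[0] + trimmed[-1]) / 2.0


# Hypothetical offsets (seconds) measured against five primaries;
# one primary is off by an hour.
readings = [0.012, -0.004, 0.007, 3600.0, 0.001]
f = max_tolerated_faults(len(readings))
estimate = fault_tolerant_midpoint(readings, f)
print(f"tolerating {f} fault(s), offset estimate = {estimate:.4f} s")
```

With five primaries this tolerates one faulty reading; the suggested set size of 5 to 10 members would tolerate between one and three under the same assumed bound. The same routine could be applied among the members of a set of secondaries as well as to their readings of the primaries.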
I think this suggests at least a 3-level rather than 2-level hierarchy of
time servers, where level 3 is generally individual networks or small
groups of such networks, and level 2 is large (well-connected) subsets of
the whole Internet.

Comments?

P.S. I would also like to see more thought given to how we should cope
with situations in which the radio timebases are inaccurate or
inconsistent.