Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!magnus.ircc.ohio-state.edu!tut.cis.ohio-state.edu!ucbvax!cs.uq.oz.au!maree
From: maree@cs.uq.oz.au
Newsgroups: comp.protocols.time.ntp
Subject: Summary: Previous time adjustment / tickadj / loss of sync
Message-ID: <3971.668671794@cs.uq.oz.au>
Date: 11 Mar 91 06:09:54 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Distribution: inet
Organization: The Internet
Lines: 221

After Russell Mosemann's posting last week Re: Previous time adjustment

>     The error message is printed when xntpd does one of its every-4-seconds
> adjustments and then finds that not all of the previous adjustment had
> been done yet.
>     I talked to Sun and found out that an early version of SUNOS 4.1.1
> had a bug in adjtime() which would not complete adjustments under
> certain circumstances.  For me, it happened when one of the programs was
> giving the system a run for the money.
>     Anyway, the part number of my tape is 700-2687-10 Rev. A.  The next
> revision should have the fix in it.  The bug number is 1036401, in case
> anyone is interested.

I pursued the problem with Sun.  Thanks to Sun (both Australia and U.S.)
for a very fast response.

The bug report (#1036401) refers to adjtime() and date -a being
ineffective.  The bug is fixed in 4.1.1; there is no patch.

The tickadj bug (details at end of this posting) explains Alan Young's
problem:

> Subject: ntp & Sparcstation running 4.0.1 -- loss of sync
> From: Alan Young
> Running ntpd (manual page dated 15 June 1989) on a Sun4 sparcstation
> running SunOS 4.0.1c I can get in sync with another host and get the
> offset down to something reasonable (< 100 ms) with a dispersion in the
> thousands.  Then the dispersion suddenly goes out to 64000 and we are
> out of sync for a couple of minutes.  This cycle repeats about every 5
> minutes.  The offset often goes out to as much as 5s.  Someone suggested
> that this may be a known problem (with solution): any ideas?
>

But....
it appears that the "Previous time adjustment didn't complete" "bug" has
nothing to do with the tickadj bug.

The following and the bug report below are reprinted with permission.

------------------------
Bug Id:       1053351
Category:     kernel
Subcategory:  syscall
Bug/Rfe:      bug
State:        closed
Synopsis:     adjtime() does not work well with xntpd

This is not a kernel or syscall bug.  The behavior of adjtime(2) in this
case is consistent with the definition of its proper behavior as
described in the adjtime(2) man page:

     The adjustment is effected by speeding up (if that amount of
     time is positive) or slowing down (if that amount of time is
     negative) the system's clock by some small percentage,
     generally a fraction of one percent.  Thus, the time is always
     a monotonically increasing function.  A time correction from
     an earlier call to adjtime() may not be finished when
     adjtime() is called again.  If olddelta is not a NULL pointer,
     then the structure it points to will contain, upon return, the
     number of microseconds still to be corrected from the earlier
     call.  If olddelta is a NULL pointer, the corresponding
     information will not be returned.

The error message noted in the bug report is generated in
xntpd/ntp_unixclock.c:

        if (adjtime(&adjtv, &oadjtv) != 0)
                syslog(LOG_ERR, "Can't do time adjustment: %m");
        if (oadjtv.tv_sec != 0 || oadjtv.tv_usec != 0) {
                syslog(LOG_ERR, "Previous time adjustment didn't complete");

The "error" case is, in fact, a perfectly legitimate result of the
adjtime() call -- a previous adjustment hasn't had time to complete.
The program should not assume that the previous time adjustment has
completed.
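
To illustrate the point above, here is a minimal sketch (not the xntpd
code) of a caller that treats a non-zero olddelta as information rather
than as an error.  The names residual_usec() and apply_adjustment() are
made up for this example, the LOG_DEBUG priority is just one reasonable
choice, and the _DEFAULT_SOURCE define is an assumption for building
against glibc; only adjtime(2) and syslog(3) are real interfaces.

```c
#define _DEFAULT_SOURCE   /* assumption: lets glibc declare adjtime() */
#include <stddef.h>
#include <sys/time.h>
#include <syslog.h>

/* Hypothetical helper: microseconds left over from an earlier adjtime(). */
long long residual_usec(const struct timeval *old)
{
    return (long long)old->tv_sec * 1000000LL + (long long)old->tv_usec;
}

/*
 * Hypothetical wrapper around adjtime(2).
 * Returns -1 only if the call itself failed.  On success it returns 0 and
 * stores the uncorrected remainder of the previous adjustment in
 * *leftover_usec; a non-zero remainder is legitimate and is only noted.
 */
int apply_adjustment(const struct timeval *delta, long long *leftover_usec)
{
    struct timeval old = { 0, 0 };

    if (adjtime(delta, &old) != 0) {
        syslog(LOG_ERR, "Can't do time adjustment: %m");
        return -1;
    }

    *leftover_usec = residual_usec(&old);
    if (*leftover_usec != 0)
        syslog(LOG_DEBUG, "previous adjustment had %lld us outstanding",
               *leftover_usec);
    return 0;
}
```

A caller could fold the reported remainder into its next correction
instead of assuming the clock has already absorbed the last one.  Note
that adjtime() with a non-NULL delta needs superuser privilege.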
---------------------

Here's the bug report:

Category:         kernel
Subcategory:      syscall
Release summary:  4.1, 4.0.3, 4.1_psr_a, 4.0, 4.1.1-alpha2
Bug/Rfe:          bug
State:            closed
Synopsis:         adjtime() and date -a are ineffective
Keywords:         ineffectiv, -a, date, adjtime(), adjtime
Severity:         2
Priority:         3

Description:

When the system time is adjusted via adjtime() (as invoked by date -a),
the time changes to the correct time.  However, within a minute, the
system automatically generates an opposite adjustment.  This occurs
since hardclock fails to ever call doresettodr() to set the rtc and
when synctodr() is called, the opposite adjustment pushes the system
time back.

The description field as copied from bug report 1038434 follows:

When the system clock is changed via an "adjtime" system call, the
contents of the TOD chip are never modified to reflect the adjusted
time; at the next reboot, the adjustment disappears.

The description field as copied from bug report 1045448 follows:

When using date -a, the 4.1 server increments the date to the required
level.  Then the date decreases to its old level.  It seems to work on
4.1.1 Beta, but I was not able to test it on a 4/[34]90 as there is no
4.1.1 PSRA Beta.  Setting the date with date works as it should.

The description field as copied from bug report 1045516 follows:

Customer tries to use the adjtime(2) system call to synchronize the
time between several machines each night.  If he determines that a
machine is 20 seconds slow, he tries to move it forward 20 seconds.
This takes a couple of minutes and does work.  Some time later, the
customer notices that the time is, once again, 20 seconds slow.  He has
been able to determine that "something" in the system (after the 20
second advance has worked) is plugging negative values into the
adjtime(2) call and moving the clock back to its original value.  The
negative values being put in are in the -2 to -3 range each time.
He was seeing this by putting in 20, waiting a period of time, and then
putting 0 into adjtime.  This puts 0 into the register, and returns the
value that was there already.  The return value at increasing times
returns decreasing values as they approach zero, then they start
becoming negative values.  The same kind of thing happens if he tries
to adjust the time backwards: it works, and then works its way forward
again.

Work around:

Setting the date rather than adjusting does work.  However, this causes
either time gaps or repetition of time intervals.

The work around field as copied from bug report 1038434 follows:

When a permanent change to the system time is required, set the time
explicitly with date or synchronize to a server with rdate.  If
synchronizing the system clock to an external standard, as when using
NTP, the logic for slaving the software time to the TOD chip time
should be disabled:

        # adb -w /vmunix
        dosynctodr?W 0
        $q
        #

The work around field as copied from bug report 1045448 follows:

Use date instead of date -a.

The work around field as copied from bug report 1045516 follows:

Use the date command, but this will disrupt the time continuum on the
system.

Suggested fix:

The suggested fix field as copied from bug report 1038434 follows:

Repair the logic for calling resettodr in kern_clock.c

State triggers:

Evaluation:

Yup.  The problem is true as stated:

        if (timedelta == 0) {
                BUMPTIME(&time, tick);
        } else {
                register delta;

                if (timedelta < 0) {
                        delta = tick - tickdelta;
                        timedelta += tickdelta;
                } else {
                        delta = tick + tickdelta;
                        timedelta -= tickdelta;
                }
                BUMPTIME(&time, delta);
                if (-tickdelta < timedelta && timedelta < tickdelta) {
                        timedelta = 0;
                        if (doresettodr) {
                                if (doresettodr == 1)
                                        doresettodr = time.tv_sec;
                                if (doresettodr != time.tv_sec) {
                                        doresettodr = 0;
                                        resettodr();
                                }
                        }
                }
        }

When the timedelta drops enuf to be zeroed, then doresettodr is set to
cause the clock chip to be reset at the next second tick.
Unfortunately, that is never reached.
The next time that synctodr() runs after timedelta gets zeroed, the
time is adjusted back to where it would have been without the
adjtime(2) call.

The evaluation field as copied from bug report 1038434 follows:

14may90 limes -- first, thanx to steve chessin for finding this.

in sys/os/kern_clock.c:hardclock() ...

When a correction has been applied via adjtime() [timedelta is nonzero
and doresettodr is nonzero], and the correction has come to its closest
point, the current time is remembered so we can set the chip clock just
as we tick over the next second.  The code that notices that we are
ticking over the next second is not in fact ever reached, as it is
contained inside the conditional for "timedelta != 0", and we have
already cleared the timedelta value.

The evaluation field as copied from bug report 1045448 follows:

We believe this to be a duplicate of 1036401, which has been fixed in
4.1.1.

Commit to fix in releases:  4.1.1-beta1
Fixed in releases:          4.1.1-beta1
Integrated in releases:     4.1.1-beta1
Verified in releases:       4.1.1-beta2
Closed because:             fixed verified

Public Summary:

The adjtime() function and date(1) -a option are only temporarily
effective, and the system immediately undoes the adjustment when the
system clock becomes correct.

Hook 2:
Needs investigation in release:
Bug End:

------------------------------------------------------------------
Maree Hegarty                              maree@cs.uq.oz.au
Computer Science, University of Queensland, 4072, Australia
Ph: +61 7 365 2864                         Fax: +61 7 365 1999
-------------------------------------------------------------------