Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site umcp-cs.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!bellcore!decvax!genrad!panda!talcott!harvard!seismo!umcp-cs!pete
From: pete@umcp-cs.UUCP (Pete Cottrell)
Newsgroups: net.unix-wizards
Subject: Re: 1-second resolution of process accounting times
Message-ID: <4514@umcp-cs.UUCP>
Date: Wed, 3-Apr-85 01:32:15 EST
Article-I.D.: umcp-cs.4514
Posted: Wed Apr 3 01:32:15 1985
Date-Received: Fri, 5-Apr-85 02:50:41 EST
References: <2228@vax4.fluke.UUCP>
Reply-To: pete@maryland.UUCP (Pete Cottrell)
Distribution: net
Organization: U of Maryland, Computer Science Dept., College Park, MD
Lines: 54

>>>Subject: 1-second resolution of process accounting times

>>>(The following applies to 4.2BSD on a VAX....)

>>>Well, it turns out that the accounting file /usr/adm/acct comprises a series
>>>of records, one per process. A record contains various interesting data
>>>about the process, including user and system CPU time. These times are
>>>stored in a cute little 16-bit floating point format with a dynamic range
>>>from 0 to 4.58E6 seconds. (Luckily, I don't run that many processes which
>>>consume more than 5.3 CPU days.)

>>>But the time is recorded in *seconds*, and is TRUNCATED by the kernel (rather
>>>than rounded) when it is written.

The range of the floating point format is a lot higher. See below.

>>>So most times recorded in the accounting file are wildly in error. A check
>>>of yesterday's accounting data shows that 10,000 out of the total 12,000
>>>processes were recorded as using zero time! I know that DEC says a VAX is
>>>fast, but...

Yeah, I've found that about 87% to 93% of all commands have either 0 system
or user CPU seconds, and about 75% to 85% of all commands have NO time
reported at all. These numbers are derived from our systems, which are an
11/780 and an 11/750.
I've found that only about 75% to 80% of CPU time is reported if you report
in seconds, as described above.

>>>I would like to see the CPU time data recorded in a form which resolves to
>>>milliseconds at the low end of the scale. It seems to me that the designer
>>>of the current scheme went overboard with 13 bits in the mantissa and cut
>>>himself short on exponent (only 3 bits). How about using a few more bits of
>>>exponent, and recording the time in milliseconds? This would still give us a
>>>couple of decimal digits of precision - and values which are meaningful for
>>>those other 10,000 processes which slipped under the rug.

The present format should work fine: the 13-bit mantissa gives you values up
to 8191, and the 3-bit exponent lets you left-shift that by 3 places up to
7 times. So the largest number representable (about 1.7E10) is larger than
anything that fits in the long variable that's handed to the compress
routine, and even in milliseconds a long covers a lot of time (a quick
calculation yields over 49 days if all 32 bits are used, or about 24.8 days
for a signed long). Make the following change to kern_acct.c, change your
accounting programs to expect milliseconds, and you're in business.

93,94c93,96
<       ap->ac_utime = compress((long)u.u_ru.ru_utime.tv_sec);
<       ap->ac_stime = compress((long)u.u_ru.ru_stime.tv_sec);
---
>       ap->ac_utime = compress((long)(u.u_ru.ru_utime.tv_sec * 1000 +
>               (u.u_ru.ru_utime.tv_usec / 1000)));
>       ap->ac_stime = compress((long)(u.u_ru.ru_stime.tv_sec * 1000 +
>               (u.u_ru.ru_stime.tv_usec / 1000)));

-- 
Call-Me:   Pete Cottrell, Univ. of Md. Comp. Sci. Dept.
UUCP:      {seismo,allegra,brl-bmd}!umcp-cs!pete
CSNet:     pete@umcp-cs
ARPA:      pete@maryland
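
For anyone who wants to poke at the 16-bit format discussed above outside the kernel, here is a rough user-level sketch of it, assuming only what the post states: a 13-bit mantissa in the low bits and a 3-bit exponent in the high bits, each exponent step being a 3-bit left shift. The names comp_encode, comp_decode and timeval_to_ms are made up for illustration; this is not the kernel's compress routine, and the rounding rule (round on the top bit of the last three bits dropped) is an assumption.

```c
#include <stdint.h>

/* Assumed layout: low 13 bits = mantissa, high 3 bits = base-8 exponent. */
#define COMP_MANT_BITS  13
#define COMP_MANT_LIMIT (1UL << COMP_MANT_BITS)   /* 8192 */

/* Hypothetical encoder: squeeze a 32-bit count into 16 bits. */
uint16_t comp_encode(uint32_t t)
{
    unsigned exp = 0;
    int round = 0;

    /* Drop 3 bits at a time until the value fits in the mantissa. */
    while (t >= COMP_MANT_LIMIT) {
        round = (t & 04) != 0;   /* top bit of the 3 bits being dropped */
        t >>= 3;
        exp++;
    }
    /* Round up; if that overflows the mantissa, shift once more. */
    if (round) {
        t++;
        if (t >= COMP_MANT_LIMIT) {
            t >>= 3;
            exp++;
        }
    }
    /* A 32-bit input needs at most 7 shifts, so exp fits in 3 bits. */
    return (uint16_t)((exp << COMP_MANT_BITS) | t);
}

/* Hypothetical decoder: the result can exceed 32 bits, hence uint64_t. */
uint64_t comp_decode(uint16_t c)
{
    unsigned exp = c >> COMP_MANT_BITS;
    uint64_t mant = c & (COMP_MANT_LIMIT - 1);

    return mant << (3 * exp);
}

/* Seconds plus microseconds to whole milliseconds, as in the patch. */
uint32_t timeval_to_ms(uint32_t sec, uint32_t usec)
{
    return sec * 1000 + usec / 1000;
}
```

Under these assumptions, a process that used 0.35 CPU seconds is stored as 0 when the count is in seconds, but as 350 when the count is in milliseconds, and 350 decodes back exactly because it is below 8192. Larger values lose their low-order bits, which is the precision trade-off the format makes.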