Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!utgpu!water!watmath!clyde!rutgers!seismo!nbires!isis!onecom!wldrdg!n0ano From: n0ano@wldrdg.UUCP Newsgroups: comp.os.minix Subject: MINIX kernel profiling Message-ID: <134@wldrdg.UUCP> Date: Tue, 5-May-87 17:07:10 EDT Article-I.D.: wldrdg.134 Posted: Tue May 5 17:07:10 1987 Date-Received: Sat, 9-May-87 07:48:00 EDT Organization: Wildridge Consulting, Boulder, CO Lines: 131 After getting MINIX up and running on my PC clone I decided to find out where the kernel was spending its time. What I did was modify the clock interrupt handler to update a counter for every value of the instruction pointer when the clock interrupt occurred. (Actually, there wasn't enough data space in the kernel so counters keep track of every 16 instruction bytes - that granularity should be reasonable.) I then created a user program that reads the counters from `/dev/kmem' and correlates them with the kernel's load maps to show a profile of where the kernel is spending its time. I then profiled the kernel while re-compiling and loading all of the files in the kernel directory. This task took about 45 minutes on my Zenith 150, 4.7 MHz 8088, 704K RAM, 10 Mbyte Winchester. The listing at the end of this article shows how many clock ticks (1 tick = 16.6 msec) were spent in either user mode or each of the routines in the kernel, file system or memory manager. A few observations about the results: 1. If you add up all the ticks you get the following breakdown: User 142810 kernel 10192 fs 13286 mm 4532 Total 170820 This means that MINIX (kernel + fs + mm) only used 28010 out of 170820 ticks or about 16% of the CPU. Considering that MINIX is a message passing kernel this isn't a bad overhead figure. 2. The time in `idle' was probably time spent waiting for disk I/O to complete. I didn't measure how many disk requests were made to the winchester so I don't know if this is a reasonable amount of time. 3. The profile for the kernel itself is not as acurate as it should be. Unfortunately, MINIX spends a lot of time with all interrupts disabled. This is why so little time is shown in `phys_copy' or `cp_mess'. One of the things I want to do is change the kernel's lock/unlock mechanism to use the interrupt mask in the PIC rather than the interrupt enable bit in the flag register. This way I should be able to get better profiling of the kernel itself. Since the file system and memory manager run as tasks their profiles should be reasonably accurate. 4. One promising note is that there doesn't appear to be any major CPU hogs in the kernel. The biggest routine, sendrec, really doesn't take up that much time. It probably should be optimized if at all possible but you won't gain much if you do that. Don Dugger Wildridge Consulting ...nbires!wldrdg!n0ano user ticks = 142810 kernel file system memory manager idle( 140) = 5992 _sendrec(4F4B) = 6663 _sendrec(2668) = 3941 _sendrec(5ED2) = 2009 _zero_bl(1C57) = 2320 _do_exec( BCC) = 210 _winches(44F9) = 1034 _read_wr( CC3) = 747 _adjust( 845) = 65 _mem_tas(364C) = 418 _get_blo(390B) = 283 _size_ok( AAB) = 31 _umap(168A) = 149 _put_blo(3AEB) = 267 _main( 1B) = 31 .dvi4(63E1) = 93 _read_ma(13DD) = 257 _mem_cop(21A9) = 29 .csb2(63A5) = 72 _alloc_b(3226) = 197 _do_fork( 358) = 24 .dvu4(64B4) = 59 _get_sup(3484) = 197 _callm1(2558) = 23 .cuu(63BC) = 53 .dvi4(5474) = 193 _send(265E) = 20 .rmi4(6522) = 51 .cuu(544F) = 156 _callm3(2594) = 18 _clock_t(324E) = 48 _main( 1B) = 147 _do_brk( 7CD) = 18 .rmu4(65A1) = 43 _search_(2675) = 133 _sys_new(2468) = 17 _cp_mess(5904) = 34 .csb2(5438) = 112 _reply( EA) = 13 _vid_cop(59F0) = 21 _get_ino(35AF) = 108 _callx(25F2) = 12 _port_ou(5933) = 21 _find_fi(3EE1) = 107 _do_wait( 626) = 10 _keyboar(284B) = 18 _release(1F0E) = 102 _do_ksig(154A) = 9 _receive(5ECD) = 13 _cmp_str(4A64) = 95 _sys_cop(2489) = 8 .csa2(6387) = 13 _copy(4A9A) = 93 _sys_xit(23B2) = 5 _set_nam(5DF4) = 13 .rmi4(5558) = 79 .cmu4(2B1D) = 5 _port_in(5943) = 12 _clock_t(4A00) = 77 _free_me(1E58) = 5 _get_tot(6361) = 10 _last_di(240A) = 74 _tell_fs(2536) = 4 .mli4(6511) = 7 _rw_user(1637) = 72 _alloc_m(1DDB) = 4 _inform(15DF) = 5 _rw_bloc(3D4E) = 64 _len(2637) = 4 _portw_i(5967) = 3 _free_bi(33C5) = 57 _sys_get(23D3) = 3 _tty_tas(178C) = 1 _dev_io(2164) = 50 _get_tot(2AF7) = 3 _do_ioct(21EB) = 50 .csb2(2B37) = 3 _scale_f(350F) = 49 _do_sign(140A) = 2 .mli4(5547) = 49 _sys_exe(2447) = 2 _get_fil(3EB4) = 44 _max_hol(1FD7) = 2 _do_fork(4830) = 29 _allowed(205F) = 2 _sys_cop(4D6C) = 28 .cuu(2B4E) = 2 _pipe_ch(1DB0) = 27 _read(2365) = 2 _get_fd(3E29) = 25 _sys_for(2424) = 1 _do_sync(4715) = 25 _do_paus(1915) = 1 _rw_dev(22FF) = 25 _close(232B) = 1 _read_ah(16BD) = 24 _sig_pro(177B) = 1 _rw_inod(384A) = 21 _open(234A) = 1 _dev_clo(213D) = 21 _reply( 17B) = 20 _dup_ino(38F5) = 18 _forbidd(43F6) = 15 _len(4F1A) = 14 _do_writ(1713) = 12 _do_fsta(40CC) = 11 _fetch_n(4B33) = 11 _put_ino(3648) = 11 _advance(2545) = 10 _new_blo(1AF8) = 10 _do_lsee( BE0) = 10 _do_read( CB5) = 8 _do_clos( AE3) = 7 _truncat(2EDF) = 7 _free_zo(3D1A) = 6 _do_open( 99C) = 5 _mounted(34C9) = 5 _revive(1F98) = 5 _wipe_in(37E2) = 5 _alloc_z(3C35) = 5 _eat_pat(23C1) = 4 _do_acce(4392) = 2 _do_exit(48EA) = 2 _suspend(1EB9) = 2 _rw_supe(352A) = 2 _alloc_i(36B4) = 2 _free_in(3829) = 2 .csa2(541A) = 2 _no_call(23B2) = 1 _do_chdi(3F2E) = 1 _do_stat(4072) = 1 _do_unli(2DFE) = 1 _unload_(317E) = 1 _sys_new(4D4B) = 1