Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!wuarchive!uunet!dg!Publius
From: Publius@dg.dg.com (Publius)
Newsgroups: comp.arch
Subject: SPEC, SPECthruput
Message-ID: <428@dg.dg.com>
Date: 3 May 90 16:43:10 GMT
Reply-To: publius@dg-pag.webo.dg.com (Publius)
Distribution: na
Organization: Performance Analysis Group, Data General, Westboro, MA.
Lines: 49

SPECthruput is a performance measurement defined by SPEC. Compared to the well-known SPECmark, it goes one step further: it recognizes the importance of multiprocessor systems in the marketplace. In many real-world environments, throughput matters more than elapsed time, and there SPECthruput is a better performance measurement than SPECmark.

However, in its present form, SPECthruput has a few shortcomings. They should be fixed before SPECthruput can live up to its good and noble intention.

The first problem I can see in the current SPECthruput methodology is that it does not specify the maximum time slice allowed. As we all understand, the larger the time slice, the lower the context switch overhead. The context switch overhead here includes not only saving and restoring the process context, but also warming up the caches and the address translation table. Leaving the maximum time slice unspecified allows vendors to inflate SPECthruput numbers by setting a time slice larger than any real-world application environment could accept.

The second problem is that what the current SPECthruput methodology measures is BATCH THROUGHPUT, not throughput in a time-shared environment. This has at least two implications. One concerns job scheduling. The other concerns cache utilization.

Concerning job scheduling, there is a fundamental difference between a batch processing environment and a time-shared environment, especially on a multiprocessor system.
In a batch processing environment, especially if all the jobs take about the same processing time (as is the case in the SPECthruput methodology), the OS can assign each job a "preferred processor" and have each job run on only one processor, which enhances throughput by reducing the burden of keeping caches coherent. In a time-shared environment, processes come and go in a random manner, and the "preferred processor" scheme mentioned above won't work as well.

Concerning cache utilization, there is also a difference in characteristics. In a heavily loaded time-shared system, physical memory tends to get fragmented. It is well known that fragmentation of physical memory can result in inefficient utilization of a direct-mapped cache when the cache size is larger than the page size (or, more accurately, the cluster size in memory management).

SPEC has done remarkable things. Let us keep improving.

-- 
Disclaimer: I speak (and write) only for myself, not my employer.

Publius     "Old federalists never die, they simply change their names."
publius@dg-pag.webo.dg.com
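
P.S. As a back-of-the-envelope illustration of two points made above (the time-slice overhead and the direct-mapped cache effect), here is a small Python sketch. Everything in it is an assumption made for illustration: the function names, the 500-microsecond switch cost, and the cache and page sizes are hypothetical and are not taken from the SPEC methodology. It assumes a fixed cost per context switch (with cache and translation warm-up lumped in) and a physically indexed direct-mapped cache in which page frame f falls into page-sized region f mod (cache size / page size).

```python
import random


def switch_overhead_fraction(time_slice_us, switch_cost_us):
    """Fraction of CPU time lost to context switching.

    Assumes every time slice runs to completion and is followed by exactly
    one switch of fixed cost (a simplification; real costs vary).
    """
    return switch_cost_us / (time_slice_us + switch_cost_us)


def cache_regions_covered(frame_numbers, cache_bytes, page_bytes):
    """Count the distinct page-sized cache regions the given frames map to.

    Assumes a physically indexed, direct-mapped cache, so frame f lands in
    region f % (cache_bytes // page_bytes).
    """
    regions = cache_bytes // page_bytes
    return len({frame % regions for frame in frame_numbers})


if __name__ == "__main__":
    # Hypothetical switch cost of 500 us; a longer slice hides more of it.
    for slice_us in (10_000, 100_000, 1_000_000):
        print(slice_us, switch_overhead_fraction(slice_us, 500))

    # Hypothetical 64 KB cache with 4 KB pages: 16 page-sized regions.
    cache_bytes, page_bytes = 64 * 1024, 4 * 1024

    # Unfragmented memory: 16 consecutive frames hit all 16 regions.
    contiguous = range(1000, 1016)
    print(cache_regions_covered(contiguous, cache_bytes, page_bytes))

    # Fragmented memory: 16 scattered frames can collide on the same
    # region, leaving part of the cache unused by this working set.
    scattered = random.Random(0).sample(range(4096), 16)
    print(cache_regions_covered(scattered, cache_bytes, page_bytes))
```

Under this model, lengthening the time slice shrinks the overhead fraction toward zero, which is why an unbounded time slice flatters the reported number. Likewise, a cache-sized working set laid out in consecutive frames uses every region of the cache, while the same number of scattered frames can use fewer.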