Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!pt.cs.cmu.edu!andrew.cmu.edu!+
From: Richard.Draves@CS.CMU.EDU
Newsgroups: comp.os.mach
Subject: Re: Context switches? Or something else
Message-ID:
Date: 15 Oct 89 04:17:28 GMT
References: <33153@cornell.UUCP>
Sender: rpd@M.GP.CS.CMU.EDU
Distribution: comp
Organization: Carnegie Mellon, Pittsburgh, PA
Lines: 48
In-Reply-To: <33153@cornell.UUCP>

> Excerpts from netnews.comp.os.mach: 13-Oct-89 Context switches? Or
> someth.. Ken Birman@gvax.cs.corne (766)

> Has anyone done benchmarks comparing the cost of UDP, "UNIX domain"
> TCP and Mach native communication for intra-machine and inter-machine
> cases? I would be very interested in seeing the figures... especially
> if they covered scatter gather as well.

Mach IPC doesn't provide for scatter/gather in the sense of
readv/writev. Mach messages can contain "out-of-line" segments of
memory, which are transferred with copy-on-write VM technology. They
pop up somewhere in the receiver's address space (somewhere that isn't
being used, of course). There is no way for the receiver to say where
they should go. The receiver can use vm_write or vm_copy to move the
memory elsewhere, I suppose. I've never seen it done.

Here are some measurements of local (between two tasks on a single
machine) RPC performance. I used a Sun-3/60.

For the Mach RPC, I used a Mig interface which sends an integer in the
request and returns an integer in the reply. The reply message is
actually bigger than the request, because Mig also includes an error
code in the reply. The important point is that these times include
the overhead of the Mig stubs packing/unpacking the messages. The
client uses msg_rpc and the server uses msg_receive/msg_send. The
server does the receive on a port set. (Port sets can contain
multiple receive rights. They're like select in that they let a
single thread receive from multiple communication channels. Most Mach
servers use them.)
In three trials of 10000 iterations each, I got 1.272, 1.298, 1.298
msecs/RPC.

For the Unix RPC, I used the same Mig-generated stubs to pack/unpack
messages, so that overhead remains the same and the same number of
bytes are moving through the kernel. I used PF_UNIX/SOCK_STREAM
sockets. The client used write/read; the server used
select/read/write. (So an RPC took five system calls instead of three
as in the Mach case.) In three trials of 10000 iterations each, I got
3.010, 3.048, 3.038 msecs/RPC.

The select was only given one file descriptor; I don't know how that
affects it. (Port sets scale properly. The time to receive from a
port set with hundreds of ports is the same as the time to receive
from a single port.) When I removed the select, I got 2.436, 2.436,
2.432 msecs/RPC. Better, but in my experience Unix servers (like X)
tend to use select.

Please let me know if there is some faster way to use sockets.

Rich
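
For anyone who wants to repeat the Unix-side measurement, the loop
described above might be sketched roughly as follows. This is not the
original benchmark program, and several details are assumptions made
for illustration: the names bench_rpc, serve, read_full and write_full
are hypothetical, socketpair() stands in for however the PF_UNIX
sockets were actually set up, and a bare 32-bit integer replaces the
Mig-packed messages (so the reply here is not larger than the request).
The client does write/read; the server does select/read/write, or just
read/write when use_select is zero.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read exactly n bytes; a stream socket may return fewer per call. */
static int read_full(int fd, void *buf, size_t n) {
    char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0) return -1;              /* EOF or error */
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Write exactly n bytes. */
static int write_full(int fd, const void *buf, size_t n) {
    const char *p = buf;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0) return -1;
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Server loop: optionally select on the single descriptor, then read
 * the request and write back request + 1. Returns when the client
 * closes its end. */
static void serve(int fd, int use_select) {
    int32_t req, reply;
    for (;;) {
        if (use_select) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            if (select(fd + 1, &rfds, NULL, NULL, NULL) <= 0) break;
        }
        if (read_full(fd, &req, sizeof req) != 0) break;
        reply = req + 1;
        if (write_full(fd, &reply, sizeof reply) != 0) break;
    }
}

/* Runs `iterations` round trips between this process and a forked
 * server. Returns mean milliseconds per round trip, or -1.0 on error. */
double bench_rpc(int iterations, int use_select) {
    int sv[2];
    struct timeval t0, t1;
    int ok = 1;

    assert(iterations > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1.0;

    pid_t pid = fork();
    if (pid < 0) { close(sv[0]); close(sv[1]); return -1.0; }
    if (pid == 0) {                         /* child: the server */
        close(sv[0]);
        serve(sv[1], use_select);
        _exit(0);
    }
    close(sv[1]);                           /* parent: the client */

    gettimeofday(&t0, NULL);
    for (int32_t i = 0; i < iterations && ok; i++) {
        int32_t reply;
        ok = write_full(sv[0], &i, sizeof i) == 0 &&
             read_full(sv[0], &reply, sizeof reply) == 0 &&
             reply == i + 1;
    }
    gettimeofday(&t1, NULL);

    close(sv[0]);                           /* EOF stops the server */
    waitpid(pid, NULL, 0);
    if (!ok) return -1.0;

    double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                (t1.tv_usec - t0.tv_usec) / 1e3;
    return ms / iterations;
}
```

Calling bench_rpc(10000, 1) and then bench_rpc(10000, 0) gives the
with-select and without-select cases. The absolute numbers on current
hardware will not be comparable to the Sun-3/60 figures; only the
relative cost of the extra select call is of interest.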