Path: utzoo!attcan!uunet!samsung!sdd.hp.com!apollo!apollo.hp.com!mishkin
From: mishkin@apollo.HP.COM (Nathaniel Mishkin)
Newsgroups: comp.protocols.misc
Subject: Re: RPC Technologies
Keywords: Transports UDP TCP Performance
Message-ID: <1990Sep14.093420@apollo.HP.COM>
Date: 14 Sep 90 13:34:00 GMT
References: <1990Sep5.194621.11656@athena.mit.edu> <1990Sep7.153710@apollo.HP.COM> <142133@sun.Eng.Sun.COM> <1990Sep11.131429@apollo.HP.COM> <142319@sun.Eng.Sun.COM>
Sender: root@apollo.HP.COM
Reply-To: mishkin@apollo.HP.COM (Nathaniel Mishkin)
Organization: Hewlett-Packard Company - Cooperative Object Computing Operation
Lines: 118

In article <142319@sun.Eng.Sun.COM>, vipin@samsun.Sun.COM (Vipin Samar) writes:
>Yes, this is precisely what I am interested in knowing. Does transparently
>mean "random" here because any of the available protocols
>give the same functionality or is there some implicit ordering?

It means "random because any of the available protocols give the same
functionality". Like I said, defining a scheme for making a more
intelligent choice among equivalent servers is an area for future work.

>Yes, I agree this is a tough problem on formalizing these
>heuristics - perhaps after all it is not such a bad idea to give control to
>users in such cases. One can never really have all the bases covered.

I definitely agree that there should be at least one mode that lets user
policy choices come into play.

>O.K., so you did build the entire TCP layer on top of UDP and
>you added all the code for that in the NCS library. Has that step really
>solved all the problems or perhaps has just moved the problems from one
>place to another.

Clearly there are SOME problems that just don't go away. We believe that
many do.

>If one cannot support huge (say 100) TCP connections because of resource
>problem, then you have said that NCS/UDP will be able to
>overcome the resource problem.
>I wonder if that is true, because
>building up of entire TCP functionality on top of UDP is going to require
>system resources as well. Multiplexing the file descriptors may
>solve this problem to some extent, but not completely, because one would
>still need to keep all those buffers hanging around for any decent
>congestion control and rate-flow problems.

Of course. However, it is also true that NCS RPC's buffers (unlike those
of the kernel implementation of TCP) aren't consuming wired physical
memory, thus giving NCS some more room to play. And then there's the
wired space in the kernel for TCP connection blocks and any other
miscellaneous TCP overhead. Further, the nature of the NCS RPC protocol
lets the server simply blow off its pseudo-connections to clients at any
point that a call is not in progress; the pseudo-connection
reestablishment happens automatically and transparently to the
application (and the stubs).

It is also true that even an RPC running on top of a COTP has buffer
management problems and that they're arguably HARDER than those of an
RPC over a CLTP. Note that when I say RPC/COTP I mean one that has
somewhat more complicated problems than, say, SunRPC/TCP.

The problematic scenario is the case where, for some reason, the stub is
not consuming its input data stream for some period of time. In our
model, we have a thread (or set of threads), distinct from the thread
running the stub, that's responsible for actually reading from the
connection. One reason for this is so that client-generated "cancels"
(i.e., requests to abort calls) can be processed even if the stub/server
thread isn't currently in the RPC runtime. Thus, there is a layer of
buffering between the connection and the stub. There is finite space for
this buffering.
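
A minimal sketch of that reader-thread arrangement, in Python rather
than anything NCS actually uses. Every name here (CallState, reader,
stub, run_call, the message kinds) is hypothetical and chosen for
illustration only; it assumes a "connection" is just an iterable of
(kind, payload) pairs and is not the NCS runtime or its wire format.

```python
import queue
import threading

# Hypothetical message kinds for this sketch, not NCS wire formats.
CANCEL = "cancel"
DATA = "data"
EOF = object()  # private end-of-stream marker placed in the buffer


class CallState:
    """Per-call state shared by the reader thread and the stub thread."""

    def __init__(self, max_buffered):
        # Finite buffering between the connection and the stub.
        self.inbox = queue.Queue(maxsize=max_buffered)
        # Set by the reader as soon as a cancel message is read.
        self.cancelled = threading.Event()


def reader(connection, call):
    """Read messages off the connection independently of the stub."""
    for kind, payload in connection:
        if kind == CANCEL:
            call.cancelled.set()      # noticed without involving the stub
        elif kind == DATA:
            call.inbox.put(payload)   # blocks while the buffer is full
    call.inbox.put(EOF)


def stub(call, results):
    """Consume buffered data in order until end of stream."""
    while True:
        item = call.inbox.get()
        if item is EOF:
            break
        results.append(item)


def run_call(messages, max_buffered=2):
    """Run one reader and one stub thread over a list of messages."""
    call = CallState(max_buffered)
    results = []
    t_reader = threading.Thread(target=reader, args=(iter(messages), call))
    t_stub = threading.Thread(target=stub, args=(call, results))
    t_reader.start()
    t_stub.start()
    t_reader.join()
    t_stub.join()
    return results, call.cancelled.is_set()
```

Calling run_call([(DATA, "a"), (CANCEL, None), (DATA, "b")]) hands the
stub "a" and "b" in order and reports that a cancel was seen. Note that
the reader blocks in put() whenever the bounded buffer is full.
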
Since once you've read data from a connection, you can't discard the
data (unless you want to oblige the client stubs to be smart enough to
restart the call), you have the problem that you might have to be smart
enough to figure out that you have to stop reading from some of your
open connections. But then you don't see the "cancel" message. Without
boring people any further: you can more or less work around these
problems, but it's by no means simple.

>Also, retransmissions would be of the entire UDP packet size and not just
>of the size of the MTU of the network. Or have you worked around this
>problem by reducing the UDP packet size to the MTU size? I hope not.
>This retransmission problem will happen only during stormy weather, but
>whenever that happens, it sure will let hell break loose.
>TCP backs off very graciously in such circumstances. How does your protocol
>behave in these cases?

We endeavor to send MTU-sized UDP packets and I don't know why you "hope
not". The general wisdom as I see it seems to be that depending on
IP-level fragmentation is a bad idea. Of course, it's currently hard to
know what's called the "path MTU". (I refer people to the IETF "Path MTU
Discovery" document for more details.) So we guess and try to be
conservative, and we eagerly await the widespread implementation of the
protocol modifications required to do path MTU discovery. I think the
IETF has now endorsed them.

>Are all the TCP tricks on top of UDP in the kernel or in the user land?
>If it is in user land, I am not sure how gratifying performance you are going
>to get out of it.

We have no special kernel support. (Note that there's a version of NCS
that runs in the kernel, but it's only for the benefit of NCS
applications that live in the kernel.) I can't give you numbers right
now. (I don't have them and I don't know that I could make them public
right now anyway.) (BTW, I would not be surprised to hear that NCS 1.5's
bulk data throughput is lower than TCP's on some systems.)
However, we believe that doing (what amounts to) a COTP in user space on
top of UDP will be sufficiently rewarding. Even if it is only as fast as
or slightly slower than TCP, it is rewarding in that it has better
scaling properties, which we care a lot about. Further, we're basing
many of our opinions on experiments that we've done and on the real-life
experience of the CMU ITC people with their Rx RPC system, which
essentially does the same thing that NCS RPC does. In any case, I take
seriously my responsibility to produce a real evaluation of the wisdom
of this approach once I'm in a position to do the evaluation.

>Another problem with NCS approach is that the user cannot take advantages
>out of any of the new performance improvements of any of the transports.
>They will have to wait for the NCS team to hack those same fixes into their
>software. In ONC, they need not do anything.

Fair enough. On the other hand, to get the benefits of performance
improvements in NCS, all you need to do is get a new NCS library, not a
whole new kernel from your vendor (as would likely be required to get
any improved TCP). My experience (as both vendor and consumer) has been
that getting updated user-mode software to customers is typically much
easier than getting new kernels to them.

--
-- Nat Mishkin
   Cooperative Object Computing Operation
   Hewlett-Packard Company
   mishkin@apollo.hp.com