Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!apple!well!nagle
From: nagle@well.UUCP (John Nagle)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Nagle algo. in Unix-TCP
Keywords: tcp tinygram RPC X11
Message-ID: <13394@well.UUCP>
Date: 29 Aug 89 16:18:35 GMT
References: <2581@lll-lcc.UUCP>
Reply-To: nagle@well.UUCP (John Nagle)
Lines: 188


In article <2581@lll-lcc.UUCP> rwolski@lll-lcc.UUCP (Richard Wolsoi) writes:
>
>My question regards the Nagle algorithms for small-packet avoidance, which
>have been implemented in the various flavors of UNIX now running around.  
>
>A colleague of mine has written an RPC mechanism to run over TCP sockets
>on UNIX systems and we are seeing some very strange performance numbers
>for certain kinds of messaging.  An ethernet trace and several minutes with
>the source code convinced us that TCP was delaying both data sends and
>acknowledgements in an effort to avoid silly windows.  
>
>Part of the problem comes from the RPC implementation which makes two
>send system calls for each request or reply (no flames please we are dealing 
>with some serious history here) in that UNIX sends out the information in
>two different packets.  The first goes out immediately (as the Nagle
>algorithm prescribes) while the second is delayed until the first is acked.
>Unfortunately, the ack is delayed as part of the receiver's half of the 
>bargain and so we were seeing a whopping 10 packets per second.
>Now for my question.  Is there any way to defeat the Nagle algorithms under
>standard implementations?  I seem to recall that the Tahoe (or is it Tajo) 
>release of 4.3 had such a defeat which was put in for X-11, but we don't 
>have systems which are quite so modern.  Specifically, we are using SunOS
>3.4, Ultrix 3.0 and UTS something-or-other with WIN-UTS.  

       What seems to have happened here is that several mechanisms in TCP
are interacting with a strange kind of application to produce poor performance.

       First, "silly window syndrome" is irrelevant here.  Silly window
syndrome occurs when the window is full most of the time, but here,
we have the tinygram problem, which occurs when the window is empty most
of the time.  It's a common misconception that the two problems are the
same.  They are not.  They are handled by separate code in the UNIX
implementation, incidentally.

       The problem here seems to be that tinygram elimination and
delayed ACKs, both performance improvements in TCP, are interacting
with this new RPC application.  Delayed ACKs
are something that first appeared in TOPS-20, which is a system that
runs TELNET in remote echo most of the time.  (So do most UNIX TELNET
implementations, not that they really need to.)  The idea there was
to make the bet that when a packet came in on a TCP connection, the
application would probably have something to reply with shortly.
Therefore, TCP was gimmicked to delay sending an ACK for about 100ms,
in hopes that the application layer would send something back and that
this would be piggybacked on the application layer's reply.  Note that
this is an assumption built into the transport layer about the behavior
of the application layer.  

        Delayed ACKs will cut traffic in half on slow TELNET operations,
and they don't bother FTP.  So they seemed like a big win at the time,
when there were few TCP applications beyond TELNET, FTP, and mail.

        TCP with delayed ACKs and tinygram elimination will support
the following kinds of applications well.

	1) Big data pipes, like FTP.
	2) TELNET-type interaction.
	3) Request-reply type transaction protocols.

But here, we have someone who is trying something that has the property
that it does

	SEND
	SEND
	wait for reply.

This doesn't take turns the way a typical transaction protocol does, so
the guesses built into TCP are bad for this situation.  Thus, this sort of use
creates problems.  Of course, as the writer points out, doing multiple
tiny sends is a bad thing on general principles.  It's always better to
do one big send than lots of little ones, given that you're not waiting
anxiously for an answer.  Sending a 1-byte message takes 41 bytes across
the net, plus any overhead at the link layer.  It's that 40:1 expansion
that led to the need for tinygram elimination.  Presumably this RPC
package is sending a bit more data at a time, so the expansion factor
may be smaller, but it may still be significant.
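
To put numbers on that expansion, here is a small sketch of the
arithmetic.  It assumes 20-byte IP and TCP headers with no options,
which is where the 41-byte figure comes from; the function names are
made up for illustration.

```c
/* Assumed header sizes with no IP or TCP options: 20 bytes each. */
enum { IP_HEADER_BYTES = 20, TCP_HEADER_BYTES = 20 };

/* Bytes crossing the net (excluding link-layer overhead) for one
 * segment carrying payload_len bytes of data. */
static unsigned long wire_bytes(unsigned long payload_len)
{
    return IP_HEADER_BYTES + TCP_HEADER_BYTES + payload_len;
}

/* Header bytes sent per byte of data; payload_len must be nonzero. */
static double overhead_ratio(unsigned long payload_len)
{
    return (double)(IP_HEADER_BYTES + TCP_HEADER_BYTES) / (double)payload_len;
}
```

With these assumptions, wire_bytes(1) is 41 and overhead_ratio(1) is
40.0, while a 100-byte send costs 140 bytes, a ratio of 0.4.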

One solution might be to make whatever RPC package he's using go through
the standard UNIX I/O library (stdio) and flush the output stream just
before reading from the input stream or before reading from another
source.  This would improve the buffering situation.  This is really a
buffering problem, after all.
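
A minimal sketch of that idea follows.  It is not taken from any
particular RPC package: send_request() and demo_single_write() are
hypothetical names, and a pipe stands in for the TCP connection so the
example can run by itself.  The point is only that both pieces are
queued in the stdio buffer and handed to the kernel by one fflush().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/*
 * Queue a header and a body on a buffered stdio stream, then flush
 * once just before waiting for the reply.  Returns 0 on success and
 * -1 on error.  Both pieces must fit in the stream's buffer to be
 * sure they leave together.
 */
static int send_request(FILE *out, const void *hdr, size_t hdr_len,
                        const void *body, size_t body_len)
{
    if (fwrite(hdr, 1, hdr_len, out) != hdr_len)
        return -1;
    if (fwrite(body, 1, body_len, out) != body_len)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}

/*
 * Demonstration on a pipe.  Returns the number of bytes the reader
 * gets from one read() after the flush, or -1 on error.
 */
static long demo_single_write(void)
{
    int fds[2];
    char buf[64];
    FILE *out;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;
    out = fdopen(fds[1], "w");
    if (out == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* Full buffering: data stays in user space until fflush(). */
    setvbuf(out, NULL, _IOFBF, BUFSIZ);

    if (send_request(out, "HDR:", 4, "payload", 7) != 0) {
        fclose(out);
        close(fds[0]);
        return -1;
    }
    n = read(fds[0], buf, sizeof buf);
    fclose(out);
    close(fds[0]);
    return (long)n;
}
```

On a real connection the stream would come from fdopen() on the
socket descriptor, and the fflush() would go right before the read
that waits for the reply.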

TCP is trying to protect the network from dumb applications.  We fixed it
back in 1985 so that when the application is dumb, the application suffers, 
not the network.  We have here a dumb application layer.

					John Nagle