Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site petrus.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!houxm!vax135!cornell!uw-beaver!tektronix!hplabs!intelca!qantel!lll-crg!ucdavis!ucbvax!decvax!bellcore!petrus!karn
From: karn@petrus.UUCP (Phil R. Karn)
Newsgroups: net.ham-radio.packet
Subject: TCP Programmers Guide
Message-ID: <751@petrus.UUCP>
Date: Wed, 11-Dec-85 00:55:36 EST
Article-I.D.: petrus.751
Posted: Wed Dec 11 00:55:36 1985
Date-Received: Wed, 18-Dec-85 05:52:42 EST
Organization: Bell Communications Research, Inc
Lines: 186

My work on a TCP implementation for amateur packet radio has reached the
point where I can describe the interface provided to the application by TCP.
I have also written a UDP (User Datagram Protocol) module; however, it is
not at the same level of maturity as the TCP code and is therefore more
subject to change. This note is meant primarily as "advance information" for
any implementers considering writing applications.

To review the purpose of TCP: it supports a reliable, sequenced, byte
stream "connection" on an end-to-end basis. It fits in roughly at the
Transport layer (level 4) of the OSI model. Since a single TCP module
supports multiple connections through the use of port numbers, it also
provides Session layer (level 5) functionality without the need for a
distinct protocol. (Or it makes the session layer unnecessary, depending
on your point of view). This package is written as a "module" intended to
be compiled and linked with the application(s) so that they can be run as one
program on the same machine. This greatly simplifies the user/TCP interface,
since it becomes just a set of internal subroutine calls on a single
machine. Reliability is much greater, since a hardware failure that
kills TCP will likely take any applications with it anyway. Only IP datagrams
flow out of the machine across hardware interfaces (such as asynch RS-232
ports or whatever else is available) so hardware flow control or complicated
host/front-end protocols are unnecessary.

A TCP connection is uniquely specified by the concatenation of source
and destination "sockets". In turn, a socket is the concatenation of a
host address (a 32-bit integer) and a TCP port (a 16-bit integer), defined
by the C structure

struct socket {
	long address;	/* 32-bit IP address */
	short port;	/* 16-bit TCP port */
};

Therefore it is possible to have several distinct connections established
at the same time to a single port on a given machine, as long as the
source sockets are distinct. Port numbers are chosen either by mutual
agreement or, more commonly when a "standard" service is involved, by using a
"well known port" number. For example, to obtain standard remote login service
(known as "telnet") one initiates a connection to TCP port 23; to send mail
using the Simple Mail Transfer Protocol (SMTP) one talks to port 25.
ARPA maintains port number lists and periodically publishes them. They will
also assign port numbers to a new application on request if it appears to be
of general interest.
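
As an illustration of why several connections can share one port, here is a
minimal sketch that compares connections by their socket pairs. It is not
part of the TCP package: the conn_id structure and both helper functions are
hypothetical names, and the socket structure is restated with fixed-width
types because "long" and "short" are no longer 32 and 16 bits everywhere.

```c
#include <stdint.h>

/* Fixed-width restatement of the socket structure from the text. */
struct socket {
    uint32_t address;   /* 32-bit IP address */
    uint16_t port;      /* 16-bit TCP port */
};

/* Hypothetical: a connection is named by its two sockets. */
struct conn_id {
    struct socket source;
    struct socket dest;
};

/* Returns 1 if both sockets have the same address and port. */
static int same_socket(const struct socket *a, const struct socket *b)
{
    return a->address == b->address && a->port == b->port;
}

/* Returns 1 if both ids name the same connection, else 0. */
int same_connection(const struct conn_id *x, const struct conn_id *y)
{
    return same_socket(&x->source, &y->source) &&
           same_socket(&x->dest, &y->dest);
}
```

Two callers reaching port 23 on the same host differ in their source
sockets, so same_connection() reports them as distinct connections.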

TCP connections are best modeled as a pair of one-way paths (one in each
direction) rather than as a single full-duplex path. Station A may close
its path to station B leaving the reverse path from B to A unaffected.
B may continue to send data to A indefinitely until it too closes
its half of the connection. This is known as "graceful close" and can
greatly simplify an application.
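
The pair-of-paths model can be pictured with a toy state record like the one
below. This is only a sketch for illustration; the duplex structure and its
functions are invented here and have nothing to do with the real TCB.

```c
/* Toy model only: each direction of a connection is tracked separately. */
struct duplex {
    int a_to_b_open;    /* 1 while station A may still send to B */
    int b_to_a_open;    /* 1 while station B may still send to A */
};

/* Both directions start out open. */
void duplex_init(struct duplex *d)
{
    d->a_to_b_open = 1;
    d->b_to_a_open = 1;
}

/* A closes its sending path; the path from B to A is not touched. */
void close_from_a(struct duplex *d)
{
    d->a_to_b_open = 0;
}

/* B closes its sending path; the path from A to B is not touched. */
void close_from_b(struct duplex *d)
{
    d->b_to_a_open = 0;
}

/* The connection is finished only when both paths are closed. */
int fully_closed(const struct duplex *d)
{
    return !d->a_to_b_open && !d->b_to_a_open;
}
```

An application can exploit this by having A send a request and close, so
that B sees the end of the request, replies at leisure, and then closes.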

My TCP code supports five basic operations on a connection: open, send,
receive, close and delete. A sixth, tcp_state(), is provided mainly for
debugging. They are summarized in the following section in the form of C
declarations and descriptions of each argument.

int net_error;

This global variable is used to indicate the specific cause of an error
in one of the TCP or UDP functions. All functions returning integers
(i.e., all except open_tcp) return -1 in the event of an error, and net_error
should be examined to determine the cause. The possible errors are defined
as constants in a header file.
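
A caller might follow this convention roughly as sketched below. The
numeric value of EWOULDBLK, the failing_call() stand-in and the
net_strerror() helper are all assumptions made for the example; the real
error constants come from the package's header file.

```c
#include <stdio.h>

int net_error;          /* set by the TCP/UDP calls on failure */

/* Hypothetical value; the real constant is defined in the header file. */
#define EWOULDBLK 1

/* Stand-in for a TCP call that fails: returns -1 and sets net_error. */
static int failing_call(void)
{
    net_error = EWOULDBLK;
    return -1;
}

/* Illustrative only: map an error code to a short description. */
const char *net_strerror(int err)
{
    switch (err) {
    case EWOULDBLK:
        return "operation would block";
    default:
        return "unknown network error";
    }
}

/* Caller-side pattern: test for -1 first, then examine net_error. */
int check_call(void)
{
    int n = failing_call();

    if (n == -1) {
        fprintf(stderr, "tcp call failed: %s\n", net_strerror(net_error));
        return net_error;
    }
    return 0;
}
```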

/* Open a TCP connection */
struct tcb *
open_tcp(lsocket,fsocket,active,notify,tos)
struct socket *lsocket,*fsocket;
int active;
void (*notify)();
char tos;

"lsocket" and "fsocket" are pointers to the local and foreign sockets,
respectively.

"active" is 0 for a "passive" open (one in the TCP LISTEN state). A passive
open does not cause any packets to be sent, but enables TCP to accept a
subsequent active open from another TCP. If a specific foreign socket is
passed to a passive open, then connect requests from all other foreign sockets
will be rejected. If the foreign socket fields are set to zero, then
connect requests from any foreign socket will be accepted. If "active" is 1,
TCP will initiate a connection to a remote socket that must previously have
been created in the LISTEN state. The foreign socket must be completely
specified in an active open.

"notify" is an optional receive "upcall" mechanism, useful when running in
an environment without an operating system. If "notify" is non-zero, it is taken
as the address of a function to be called whenever a "significant" amount of
data arrives. This user-provided function may then invoke recv_tcp() to
obtain the incoming data.

"tos" is the Internet "type of service" field, consisting of precedence
and class of service parameters. There are 8 levels of precedence; the
bottom 6 are defined by the military as Routine, Priority, Immediate,
Flash, Flash Override and CRITIC/ECP. (The top two are reserved for internal
network functions.) For amateur use we can use the lower four as Routine,
Welfare, Priority and Emergency. Three more bits specify class of service,
indicating that especially high reliability, high throughput or low delay
is needed for this connection. The entire TOS field is passed along to IP
in each datagram and is interpreted by each IP gateway (packet switch) in
the route. The precedence value actually used is the higher of those specified
in the two open_tcp() calls (one at each end of the connection).
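
For illustration, a TOS byte could be packed as sketched below. The bit
positions follow the type-of-service octet in RFC 791 (precedence in the top
three bits, then the delay, throughput and reliability bits). The macro and
function names are hypothetical and are not taken from the package; the
four precedence names are the amateur assignments suggested above.

```c
/* Amateur precedence levels suggested in the text (0 is the lowest). */
enum {
    PREC_ROUTINE   = 0,
    PREC_WELFARE   = 1,
    PREC_PRIORITY  = 2,
    PREC_EMERGENCY = 3
};

/* Class-of-service bits, positioned as in the RFC 791 TOS octet. */
#define TOS_LOW_DELAY   0x10
#define TOS_THROUGHPUT  0x08
#define TOS_RELIABILITY 0x04

/* Hypothetical helper: pack precedence (0-7) and service bits. */
unsigned char make_tos(unsigned prec, unsigned service)
{
    return (unsigned char)(((prec & 7u) << 5) | (service & 0x1Cu));
}

/* Extract the 3-bit precedence from a TOS byte. */
unsigned tos_precedence(unsigned char tos)
{
    return (tos >> 5) & 7u;
}

/* Precedence used on the connection: the higher of the two requests. */
unsigned effective_precedence(unsigned char local_tos,
                              unsigned char remote_tos)
{
    unsigned a = tos_precedence(local_tos);
    unsigned b = tos_precedence(remote_tos);

    return a > b ? a : b;
}
```

With these definitions, an emergency connection asking for low delay would
pass make_tos(PREC_EMERGENCY, TOS_LOW_DELAY) as the "tos" argument.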

open_tcp() returns a pointer to an internal Transmission Control Block ("tcb").
This "magic cookie" must be passed back as the first argument to all other TCP
calls. In event of error, the NULL pointer (0) is returned.
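
The rules for the foreign socket in passive and active opens can be
restated as the small sketch below. It models the rules only and does not
call open_tcp(); all three function names are hypothetical. It assumes the
wildcard means both fields are zero, since the text does not say how a
partly zero foreign socket is treated.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width restatement of the socket structure from the text. */
struct socket {
    uint32_t address;   /* 32-bit IP address */
    uint16_t port;      /* 16-bit TCP port */
};

/* Returns 1 if every field is zero (assumed to mean "any socket"). */
static int is_wildcard(const struct socket *s)
{
    return s->address == 0 && s->port == 0;
}

/* An active open needs a completely specified foreign socket. */
int active_open_ok(const struct socket *fsocket)
{
    return fsocket != NULL && fsocket->address != 0 && fsocket->port != 0;
}

/*
 * Would a passive open made with listen_fsocket accept a connect
 * request arriving from peer?
 */
int listener_accepts(const struct socket *listen_fsocket,
                     const struct socket *peer)
{
    if (is_wildcard(listen_fsocket))
        return 1;       /* zero foreign socket: accept anyone */
    return listen_fsocket->address == peer->address &&
           listen_fsocket->port == peer->port;
}
```

A server for a well known port would therefore presumably pass a zeroed
foreign socket with "active" set to 0, while a client fills in both the
address and the port of the server and sets "active" to 1.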

The only limit on the number of TCBs that may exist at any time (i.e., the
number of simultaneous connections) is the amount of free memory on the
machine. Each TCB on a 16-bit processor takes up about 129 bytes; additional
memory is consumed and freed dynamically as needed to buffer send and receive
data. Deleting a TCB (see the delete_tcp() call) reclaims its space.

/* Send data on a TCP connection */
int
send_tcp(tcb,data,cnt)
struct tcb *tcb;
char *data;
unsigned cnt;

"tcb" is the pointer returned by the open_tcp() call. "data" points to the
user's buffer, and "cnt" specifies how long it is. The number of bytes
actually queued for transmission is returned; it will equal "cnt" unless
some sort of error occurs, such as lack of memory for buffering.
The data is copied into an internal queue, a transmission is attempted,
and the function returns so that the user may immediately reuse his buffer.
TCP uses positive acknowledgments and retransmission to ensure in-order
delivery, but this is largely invisible to the user. TCP enforces no limit
on how much data can be queued for transmission, so the user should be careful
not to run the system out of free memory. (This is something else that requires
a multitasking kernel to do right).
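
Checking the return value of send_tcp() might look like the sketch below.
The send_tcp() here is a mock with a small fixed queue standing in for free
memory, written with an ANSI prototype; the tcb layout, QUEUE_MAX and
send_checked() are invented for the example. It also assumes a memory
shortage shows up as a short count, although the real call may return -1
instead.

```c
#include <string.h>

#define QUEUE_MAX 64    /* mock limit standing in for free memory */

struct tcb {            /* mock control block, not the real layout */
    char queue[QUEUE_MAX];
    unsigned used;      /* bytes currently queued */
};

/* Mock of send_tcp(): copies what fits and returns the count queued. */
int send_tcp(struct tcb *tcb, const char *data, unsigned cnt)
{
    unsigned room = QUEUE_MAX - tcb->used;
    unsigned n = cnt < room ? cnt : room;

    memcpy(tcb->queue + tcb->used, data, n);
    tcb->used += n;
    return (int)n;
}

/* Caller-side check: returns how many bytes were NOT accepted. */
unsigned send_checked(struct tcb *tcb, const char *data, unsigned cnt)
{
    int n = send_tcp(tcb, data, cnt);

    if (n < 0)
        return cnt;             /* error: nothing was queued */
    return cnt - (unsigned)n;   /* 0 means everything was queued */
}
```

Because the data is copied before the call returns, the caller may reuse
its buffer at once, but it still has to decide what to do with any bytes
that send_checked() reports as not accepted.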

/* Receive data on a TCP connection */
int
recv_tcp(tcb,data,cnt)
struct tcb *tcb;
char *data;
unsigned cnt;

The arguments to recv_tcp() are identical to those of send_tcp(), except that
any data on the connection's receive queue is placed in the user's buffer,
up to a maximum of "cnt" bytes. The actual number of bytes received (the
lesser of "cnt" and the number pending on the receive queue) is returned.
Since this TCP module cannot assume the presence of sleep/wakeup primitives
provided by an underlying operating system, recv_tcp() is currently designed
to return -1 with net_error set to EWOULDBLK if no incoming data is pending.
The "notify" feature on open_tcp() is provided to eliminate the need for
constant polling of the recv_tcp() function; whenever TCP calls the notify
function, it guarantees that recv_tcp() will not return -1. (Technical
note: "notify" is called whenever a PUSH or FIN bit is seen in an incoming
segment, or if the receive window fills. It is also called before an ACK
is sent back to the remote TCP, in order to give the user an opportunity
to piggyback any data in response.)

When the remote TCP closes its half of the connection and all prior incoming
data has been read by the local user, subsequent calls to recv_tcp() return
0 rather than -1 as an "end of transmission" indicator.
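
The three kinds of return value from recv_tcp() can be handled in a notify
function along the lines sketched below. Everything here is a mock under
stated assumptions: the tcb layout, the value of EWOULDBLK, and the idea
that the notify function receives the tcb pointer as its argument (the
declaration of "notify" does not say what arguments it gets).

```c
#include <string.h>

#define EWOULDBLK 1     /* hypothetical value */
int net_error;

struct tcb {            /* mock control block, not the real layout */
    char rcvq[128];
    unsigned rcvcnt;    /* bytes pending on the receive queue */
    int remote_closed;  /* 1 once the remote half is closed */
};

/* Mock of recv_tcp() following the return conventions in the text. */
int recv_tcp(struct tcb *tcb, char *data, unsigned cnt)
{
    unsigned n;

    if (tcb->rcvcnt == 0) {
        if (tcb->remote_closed)
            return 0;           /* end of transmission */
        net_error = EWOULDBLK;
        return -1;              /* nothing pending yet */
    }
    n = cnt < tcb->rcvcnt ? cnt : tcb->rcvcnt;
    memcpy(data, tcb->rcvq, n);
    memmove(tcb->rcvq, tcb->rcvq + n, tcb->rcvcnt - n);
    tcb->rcvcnt -= n;
    return (int)n;
}

static char inbox[256]; /* where the application collects its data */
static unsigned inbox_len;
static int saw_eof;

/* Upcall: drain the receive queue until it would block or hits EOF. */
void on_data(struct tcb *tcb)
{
    char buf[16];
    int n;

    while ((n = recv_tcp(tcb, buf, sizeof buf)) > 0) {
        /* Data that does not fit in inbox is dropped in this sketch. */
        if (inbox_len + (unsigned)n <= sizeof inbox) {
            memcpy(inbox + inbox_len, buf, (unsigned)n);
            inbox_len += (unsigned)n;
        }
    }
    if (n == 0)
        saw_eof = 1;    /* remote closed and all data has been read */
    /* n == -1 with net_error == EWOULDBLK: wait for the next upcall */
}
```

Once saw_eof is set, the application knows the remote side has finished
sending and can close its own half when it has nothing more to say.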

/* Close a TCP connection */
close_tcp(tcb)
struct tcb *tcb;

This tells TCP that the local user has no more data to send. However, the
remote TCP may continue to send data indefinitely to the local user,
until the remote user also does a close_tcp().  An attempt to send data
after a close_tcp() is an error.

/* Delete a TCP connection */
delete_tcp(tcb)
struct tcb *tcb;

When the connection has been closed in both directions and all incoming
data has been read, this call is made to cause TCP to reclaim the space
taken up by the transmission control block. Any unread incoming data is lost.

/* Dump a TCP connection state */
tcp_state(tcb)
struct tcb *tcb;

This debugging call prints an ASCII-formatted dump of the TCP connection
state on the terminal. You need a copy of the TCP specification (ARPA
RFC 793 or MIL-STD-1778) to interpret most of the numbers.

Well, that's it. Constructive comments on this interface are welcomed.

73, Phil Karn, KA9Q