Path: utzoo!utgpu!water!watmath!clyde!att!rutgers!mit-eddie!bu-cs!madd
From: madd@bu-cs.BU.EDU (Jim Frost)
Newsgroups: comp.unix.questions
Subject: Re: What is 'sockets' ??? (A Simple Tutorial)
Message-ID: <24549@bu-cs.BU.EDU>
Date: 24 Aug 88 23:04:47 GMT
References: <916@altger.UUCP> <913@buengc.BU.EDU>
Reply-To: madd@bu-it.bu.edu (Jim Frost)
Followup-To: comp.unix.questions
Organization: Boston University Distributed Systems Group
Lines: 180

In article <913@buengc.BU.EDU> bph@buengc.bu.edu (Blair P. Houghton) writes:
|In article <916@altger.UUCP> amigaeb@altger.UUCP (Ronny Hansen) writes:
|>I am trying to learn about socket's, but I cant find anything
|>to learn from. No books. No magazines. No nothing.
|
|Look for "A 4.2BSD Interprocess Communication Primer" [...]
|Pure literary review:  it's one of the hardest things to read I've ever
|read.  It ain't the material, either.  It's just a style problem.
|Well, noone said computerz was e-z...

I found that it was nice for info once I understood what was happening
but it's not the kind of thing to unleash on a beginner.  Here's my
quick primer.

There are only a handful of UNIX system calls which take care of
handling sockets.  These are:

	socket()
	bind()
	listen()
	connect()
	accept()
	read()
	write()
	close()

close() does exactly what you'd expect.

socket() creates a new socket and returns a file descriptor that
refers to it.  You have to tell it what kind of socket you want,
usually SOCK_STREAM or SOCK_DGRAM.  TCP/IP connections are
SOCK_STREAM; everything here assumes stream-type sockets.
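
As a rough sketch of the call itself (make_stream_socket() is a name I
made up for illustration, not a library routine), creating a stream
socket in the Internet domain looks something like this.  It uses ANSI
prototypes so it compiles on current compilers:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* make_stream_socket() is a hypothetical helper, not a system call.
 * Returns a new descriptor, or -1 with errno set by socket(). */
int make_stream_socket(void)
{
  /* AF_INET     = Internet addresses
   * SOCK_STREAM = reliable byte stream (use SOCK_DGRAM for datagrams)
   * 0           = let the system pick the protocol, TCP for this pair */
  return socket(AF_INET, SOCK_STREAM, 0);
}
```

The descriptor is not connected to anything yet; it still needs
bind() or connect() before data can move.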

The bind() call binds a local address to the socket, much like
establishing the phone number of the phone you're using.

connect() calls another system and binds the far address to the
socket, much like dialing the phone.  On the other end of the
connection, listen() marks a bound socket as ready to take calls and
accept() takes the next one; it's analogous to someone picking up the
phone.  accept() returns a new descriptor with the caller's address
bound to it and leaves the original socket free to take more calls.
Once both addresses are known to the socket, data can flow.

An example routine that creates a socket and calls another host is:

  #include <sys/types.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <netdb.h>
  #include <errno.h>
  #include <strings.h>
  #include <unistd.h>

  #define PORTNUM 50000  /* example value only; use your server's port */

  int hopen(hostname)
  char *hostname;
  { struct sockaddr_in sa;
    struct hostent     *hp;
    int sock;

    /* find host table entry
     */

    if((hp= gethostbyname(hostname)) == NULL) {
      errno= ECONNREFUSED; /* return some reasonable error */
      return(-1);
    }

    /* make sure this system knows about TCP
     */

    if (getprotobyname("tcp") == NULL) {
      errno= ENOPROTOOPT;
      return(-1);
    }

    /* set up remote address (host and port number)
     */

    bzero((char *)&sa,sizeof(sa));
    bcopy(hp->h_addr,(char *)&sa.sin_addr,hp->h_length);
    sa.sin_family= hp->h_addrtype;
    sa.sin_port= htons((u_short)PORTNUM);

    /* create socket and do connection
     */

    if ((sock= socket(hp->h_addrtype,SOCK_STREAM,0)) < 0) /* get socket */
      return(-1);
    if (connect(sock,(struct sockaddr *)&sa,sizeof sa) < 0) { /* connect */
      close(sock);                   /* don't leak the descriptor */
      return(-1);
    }
    return(sock);
  }

To accept a connection, things are a little different.  The following
two functions do it:

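  #include <sys/types.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <netdb.h>
  #include <strings.h>
  #include <unistd.h>

  #define PORTNUM     50000  /* example value only; pick your own port */
  #define MAXHOSTNAME 255    /* example value only */

  int establish();

  int get_connection() {
    struct sockaddr_in isa;
    int i, s, t;

    if ((s= establish(PORTNUM)) < 0)   /* bound, listening socket */
      return(-1);

    i = sizeof(isa);                   /* room for the caller's address */
    if ((t = accept(s,(struct sockaddr *)&isa,&i)) < 0) {
      close(s);
      return(-1);
    }
    close(s);                          /* one caller only; stop listening */
    return(t);                         /* the connected socket */
  }

  /* code to establish a socket; originally from bzs@bu-cs.bu.edu
   */

  int establish(portnum)
  u_short portnum;
  { char   myname[MAXHOSTNAME+1];
    int    s;
    struct sockaddr_in sa;
    struct hostent *hp;

    gethostname(myname,MAXHOSTNAME);            /* who are we? */
    myname[MAXHOSTNAME]= '\0';
    bzero((char *)&sa,sizeof(struct sockaddr_in));
    hp= gethostbyname(myname);                  /* get our address info */
    if (hp == NULL)                             /* we don't exist !? */
      return(-1);
    sa.sin_family= hp->h_addrtype;              /* set up info for new socket */
    sa.sin_port= htons(portnum);                /* sin_addr left 0: any local address */
    if ((s= socket(AF_INET,SOCK_STREAM,0)) < 0) /* make new socket */
      return(-1);
    if (bind(s,(struct sockaddr *)&sa,sizeof sa) < 0) { /* bind socket */
      close(s);
      return(-1);
    }
    listen(s,3);                                /* queue up to 3 callers */
    return(s);
  }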

Usually the accept() call is done in a loop and a child process is
forked off to handle each connected socket.  This is a little more
complex than the single-connection code above.

Once you have a connected socket you can push data through it.
Sockets are bidirectional, so either end can read or write.

Writing to a socket is done through the normal file I/O functions.  I
recommend checking to make sure the write actually wrote all that you
told it to; more on this under read(), below.

Reading is handled the same way as normal WITH ONE EXCEPTION.  People
expect a read() to return the amount of data that it read, but usually
they also expect it to read as much as they tell it to.  This is often
not the case with sockets: read() hands back whatever has arrived so
far.  The correct way to read a specific amount of information is to
keep reading in a loop until you have it all.  This is the function I
use to read "n" characters into a buffer:

  int hread(s,buf,n)
  int  s;
  char *buf;
  int  n;
  { int bcount,                      /* counts bytes read */
        br;                          /* bytes read this pass */

    bcount= 0;
    br= 0;
    while (bcount < n) {             /* loop until full buffer */
      if ((br= read(s,buf,n-bcount)) > 0) {
        bcount += br;                /* increment byte counter */
        buf += br;                   /* move buffer ptr for next read */
      }
      else if (br < 0)               /* signal an error to the caller */
        return(-1);
      else                           /* br == 0: other end hung up, */
        break;                       /* so return the short count */
    }
    return(bcount);
  }

I recommend using a very similar routine to handle writes.

If you forget the looping part you end up losing characters.  It took
me quite some time to track this down the first time I was writing
networking code and I'd like to help newcomers avoid that.

This ought to give you enough information to start using sockets for
networking applications.  It's not complete by any means: it deals
only with TCP/IP stream connections and ignores datagram-style
sockets.  Most of the ideas carry over, except that datagram sockets
(at least UDP) do not guarantee delivery of datagrams, so things get
much tougher.  Take a look at real applications for other examples.

jim frost
madd@bu-it.bu.edu