Xref: utzoo comp.sys.att:10963 unix-pc.general:6562
Path: utzoo!utgpu!watserv1!watmath!uunet!shelby!apple!portal!cup.portal.com!thad
From: thad@cup.portal.com (Thad P Floryan)
Newsgroups: comp.sys.att,unix-pc.general
Subject: Are 3B1 "pipes" really slower than molasses?
Message-ID: <36256@cup.portal.com>
Date: 27 Nov 90 06:26:14 GMT
Organization: The Portal System (TM)
Lines: 291

Yet another chapter in the saga of the ongoing "Don't shoe-shine MY data!"

While investigating why the tape backup operation on the 3B1 is so s-l-o-w,
even with double-buffering techniques, I finally pinpointed what appears to
be the cause: PIPES.

Pipes are used to transfer data to "tapecpio" in all the supplied shell
scripts, and pipes are typically used to pass data from a "find" (i.e.
"find * -print | cpio -oc > whatever").

"Piping" was the ONLY thing in common with all my testing, so I decided to
instrument some pipe runs and see what gives. Seems the 3B1 pipes leak bits
out into the Great Bit Bucket or sumtin'.

This is the first time I've ever had something "bad" to say about the 3B1.
And this "problem" affects more than just backups, it affects ANYTHING using
pipes, so this should be of interest to you no matter what system you're
using.

Specifically: the BEST performance observed is approx. 35 KBytes/Second
between two processes which are piped together. Adding more "drains" to the
"pipe" worsens performance.

I tested 4 UNIXPC systems, ranging from 4MB RAM/85MB HD to 1MB RAM/10MB HD,
and the results are all in the same ballpark: 35-36 KBytes per second.

Perhaps there's something I'm just not seeing, or perhaps some "ktune" params
are not obvious.

I'm working on the assumption that "pipes" are a performance bottleneck on
the UNIXPC and so I went and grabbed some tape utils from site
wsmr-simtel20.army.mil to see if a non-piped tape backup/restore program can
improve performance.

This will take some time to check out, so in the meantime here are two things
I'm asking:

1) Enclosed are my test programs, a Makefile, and a shell-script to exercise
   the tests. Try them on your system. If the results are substantially
   different, please post them along with your present "ktune" parameters
   (you get these by: "su; ktune -d"). By results "substantially different"
   I mean you're getting 200 KBytes/Sec or something else radically different
   from my results (below).

2) If you know of ways to improve pipe performance, please post them.

I don't recall any discussions of this "problem" mentioned in this newsgroup
before, so maybe I've opened a new "can-of-worms" here; wouldn't be the first
time and definitely won't be the last! :-)

Enclosed with this posting is a "shar" of my test suite. You may need to
change the "gcc" in the Makefile to be "cc", but I tried both with no change
in the observed performance. If nothing else, you may find the timing code
in "recv.c" interesting.

To run the tests, do either:

	$ ./test.sh
(OR)
	$ nohup ./test.sh &

That second form places its output in a file named "nohup.out".

In all cases, the output will look something like:

	$ ./test.sh

	send | recv

	100000 characters received in 2.783 seconds for 35928 CPS
	200000 characters received in 5.833 seconds for 34285 CPS
	300000 characters received in 8.350 seconds for 35928 CPS
	400000 characters received in 11.933 seconds for 33519 CPS
	500000 characters received in 14.100 seconds for 35460 CPS
	1000000 characters received in 28.200 seconds for 35460 CPS

	send | pass | recv

	100000 characters received in 5.566 seconds for 17964 CPS
	200000 characters received in 10.333 seconds for 19354 CPS
	300000 characters received in 16.200 seconds for 18518 CPS
	400000 characters received in 21.200 seconds for 18867 CPS
	500000 characters received in 26.733 seconds for 18703 CPS
	1000000 characters received in 53.050 seconds for 18850 CPS

If you see any flaws in my testing techniques, I'd appreciate knowing about
them, too. But I've checked this out quite thoroughly and I'm convinced that
what I'm seeing with the results (above) is the actual piping throughput.

The "ktune" parameters on my systems are (the comments are my annotations):

	# ktune -d
	nbuf      100	#number of system buffers for block devices
	ninode    400	#number of memory-resident inodes at one time
	nfile     300	#number of files open on system at one time
	nproc     100	#number of processes existing at one time
	ntext      75	#number of text structures allocated in kernel
	nclist    150	#number of clist buffers available
	npbuf      16	#number of buffer headers in the raw I/O pool
	ncall      32	#number of callouts allowed in the kernel
	nttyhog  1024	#number of chars in tty buffers before implicit flush

Some other systems I've already tested with the same suite include (with the
results for 1,000,000 chars in both tests rounded to nearest 1000):

	HP-9000/840 (Spectrum RISC),  HP-UX 3.01, 240000 CPS and 120000 CPS
	HP-9000/350 (Motorola 68030), HP-UX 7.0,  156000 CPS and  85000 CPS

Thad

Thad Floryan [ thad@cup.portal.com (OR) ..!sun!portal!cup.portal.com!thad ]

---- Cut Here and unpack ----
#!/bin/sh
# This is a shell archive (shar 3.32)
# made 11/27/1990 05:18 UTC by thad@thadlabs
# Source directory /u/thad/Filecabinet/WORK/pipe-test
#
# existing files WILL be overwritten
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#    485 -rw-r--r-- Makefile
#    247 -rw-r--r-- pass.c
#    824 -rw-r--r-- recv.c
#    332 -rw-r--r-- send.c
#    411 -rwxr-xr-x test.sh
#
if touch 2>&1 | fgrep 'amc' > /dev/null
 then TOUCH=touch
 else TOUCH=true
fi
# ============= Makefile ==============
echo "x - extracting Makefile (Text)"
sed 's/^X//' << 'SHAR_EOF' > Makefile &&
X# 3B1 makefile for pipe speed testing
X#
XCC = gcc
XCFLAGS = -O
XLDFLAGS = -s
XLIBS = /lib/crt0s.o /lib/shlib.ifile
XNAME1 = send
XOBJS1 = send.o
XNAME2 = recv
XOBJS2 = recv.o
XNAME3 = pass
XOBJS3 = pass.o
X
Xall : $(NAME1) $(NAME2) $(NAME3)
X
X$(NAME1): $(OBJS1)
X	$(LD) $(LDFLAGS) -o $(NAME1) $(OBJS1) $(LIBS)
X
X$(NAME2): $(OBJS2)
X	$(LD) $(LDFLAGS) -o $(NAME2) $(OBJS2) $(LIBS)
X
X$(NAME3): $(OBJS3)
X	$(LD) $(LDFLAGS) -o $(NAME3) $(OBJS3) $(LIBS)
X
Xclean :
X	rm -f $(OBJS1) $(OBJS2) $(OBJS3) core *~
SHAR_EOF
$TOUCH -am 1126050290 Makefile &&
chmod 0644 Makefile ||
echo "restore of Makefile failed"
set `wc -c Makefile`;Wc_c=$1
if test "$Wc_c" != "485"; then
	echo original size 485, current size $Wc_c
fi
# ============= pass.c ==============
echo "x - extracting pass.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > pass.c &&
X/* pass.c
X *
X *	just passes/handoffs chars from stdin to stdout until EOF for testing
X *	the speed of pipes on the system.
X *
X *	Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X
Xmain()
X{
X	int c;
X
X	while ( (c = getchar()) != EOF ) putchar(c);
X
X}
SHAR_EOF
$TOUCH -am 1126045490 pass.c &&
chmod 0644 pass.c ||
echo "restore of pass.c failed"
set `wc -c pass.c`;Wc_c=$1
if test "$Wc_c" != "247"; then
	echo original size 247, current size $Wc_c
fi
# ============= recv.c ==============
echo "x - extracting recv.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > recv.c &&
X/* recv.c
X *
X *	just receives chars from stdin until EOF for testing the speed
X *	of pipes on the system.
X *
X *	Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X#include <sys/param.h>		/* for def of HZ */
X#include <sys/types.h>
X#include <sys/times.h>
X
Xmain()
X{
X	extern long times();
X
X	long startime, endtime, elapsed;
X	struct tms timebuf;
X	long numchrs = 0;
X
X	startime = times(&timebuf);	/* get start time in HZ units */
X
X	while ( getchar() != EOF ) ++numchrs;
X
X	endtime = times(&timebuf);	/* get completion time in HZ units */
X
X	if ( (elapsed = endtime - startime) != 0L )
X	{
X	    /* all four arguments are longs, so print them with %ld */
X	    printf("%ld characters received in %ld.%03ld seconds for %ld CPS\n",
X		numchrs,
X		elapsed / HZ,
X		((elapsed % HZ) * 1000L) / HZ,
X		((numchrs * HZ) / elapsed ));
X	}
X	else
X	{
X	    printf("Insufficient timer resolution for supplied input\n");
X	}
X}
SHAR_EOF
$TOUCH -am 1126045390 recv.c &&
chmod 0644 recv.c ||
echo "restore of recv.c failed"
set `wc -c recv.c`;Wc_c=$1
if test "$Wc_c" != "824"; then
	echo original size 824, current size $Wc_c
fi
# ============= send.c ==============
echo "x - extracting send.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > send.c &&
X/* send.c
X *
X *	just sends argv[1] number of characters out for testing the speed
X *	of pipes on the system.
X *
X *	Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X
Xmain(argc, argv)
X	int argc;
X	char *argv[];
X{
X	long numchrs;
X
X	numchrs = atol(argv[1]);	/* dismiss error checks for now */
X
X	while ( --numchrs >= 0L ) putchar('X');
X}
SHAR_EOF
$TOUCH -am 1126044190 send.c &&
chmod 0644 send.c ||
echo "restore of send.c failed"
set `wc -c send.c`;Wc_c=$1
if test "$Wc_c" != "332"; then
	echo original size 332, current size $Wc_c
fi
# ============= test.sh ==============
echo "x - extracting test.sh (Text)"
sed 's/^X//' << 'SHAR_EOF' > test.sh &&
Xecho "\nsend | recv\n"
X./send 100000 | ./recv
X./send 200000 | ./recv
X./send 300000 | ./recv
X./send 400000 | ./recv
X./send 500000 | ./recv
X./send 1000000 | ./recv
Xecho "\nsend | pass | recv\n"
X./send 100000 | ./pass | ./recv
X./send 200000 | ./pass | ./recv
X./send 300000 | ./pass | ./recv
X./send 400000 | ./pass | ./recv
X./send 500000 | ./pass | ./recv
X./send 1000000 | ./pass | ./recv
SHAR_EOF
$TOUCH -am 1126175890 test.sh &&
chmod 0755 test.sh ||
echo "restore of test.sh failed"
set `wc -c test.sh`;Wc_c=$1
if test "$Wc_c" != "411"; then
	echo original size 411, current size $Wc_c
fi
exit 0