Path: utzoo!utgpu!watserv1!watmath!att!att!linac!pacific.mps.ohio-state.edu!zaphod.mps.ohio-state.edu!julius.cs.uiuc.edu!apple!spies!zorch!xanthian
From: xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan)
Newsgroups: comp.sys.amiga.tech
Subject: Re: PIPEs
Message-ID: <1990Nov18.150258.16061@zorch.SF-Bay.ORG>
Date: 18 Nov 90 15:02:58 GMT
References: <1990Nov10.082242.22949@agate.berkeley.edu> <8283@gollum.twg.com>
Organization: SF-Bay Public-Access Unix
Lines: 257

david@twg.com (David S. Herron) writes:
> pete@violet.berkeley.edu (Pete Goodeve) writes:
>> Kent Paul Dolan (xanthian@zorch.SF-Bay.ORG) writes:

> [ an interesting idea which I (personally) think looks a bit
> ugly and daunting to type from the keyboard ..]

Perhaps, but easy and intuitive and solving many problems not currently
capable of solution. You haven't seen "ugly" yet; see below.

>> Hmmm. What an interesting idea... In fact it gets more intriguing as
>> I think on it. "Fan out pipes" are something that even unix can't do,
>> as far as I can see. (You can 'tee' to a pipe and a file, but not to
>> two parallel pipes, can you?) And a convenient way of "broadcasting"
>> data to a number of processes is something that I've had on my mind
>> for a long, long time.

> Unix has the capability of doing this - all the OS facilities are
> there it's just that it's hard to represent it in a linear line of
> text. Since Unix commands (under /bin/{,c,k}sh) are a linear line of
> text this is a problem. In fact, I vaguely remember seeing a multi-way
> `tee' program come across some sources group once.

I don't think so. The example, reformatted below, explicitly has a
process feeding a prior process, something which doesn't work with
tempfiles. I could have included a loop, the notation supports it, but
making it anything but an infinite data source then requires knowing
the internals of the executing commands, a point I wanted to avoid
raising in a first proposal.

> Another problem is -- where might this be used?
> As I said above, the notation you're suggesting is a bit ugly (to my
> eye).

Anywhere you'd want a complex interconnection of interoperating
programs; consider throwing up a half dozen requestors from a half
dozen programs to control some multimedia spectacular. This allows them
to be tightly coupled without being co-written.

> It's not the sort of command I'd be typing in off the top of my head
> and, besides, I've lived on Unix for years with just linear pipes. The
> few times where it might've been nice to have tree-structures of
> piped-together processes has always been in a shell script & it was
> pretty easy to use temp files of my own & delete them when done.

Except that temp files kill speed, hog space, and are quite
inappropriate for the "immortal" processes that run from boot time to
boot time in modern multitasking OSs. There was no suggestion that an
extremely complex nest of multiple input, multiple output pipes had to
be created _literally_ from the command line; like all complex
processes, that model would give way in reality often to the scripts
you suggest.

> For an ad-hoc creation of tree-structured piles of processes it seemed
> you'd want some sort of graphical shell in which you'd have process
> names and pipe symbols that you could drag around & connect up & play
> around with.

This is certainly an alternative, and it has been tried several times;
the lack of precision, problems of manipulation, and lack of a common
agreement on terms and appearances have so far left the efforts mostly
research toys. That doesn't lessen the ultimate value of a working
"visual pipeboard" solution, but it does suggest that a simpler, shell
oriented solution for the shell user has a place.

>       cmd1 | (
>               tee /tmp/some-file; some-other-command ) | cmd2

> Hmm.. that just came to me, and is perfectly reasonable in Unix
> shell-ese.
> The string of commands under "some-other-command" would not
> even start until the "tee" process finishes, then any output it makes
> would go into `cmd2', _following_ the output of `tee'. Also if you
> iterate on the () (in Unix shell-ese that starts up a subprocess
> executing the command string within ()'s) you can create a fairly
> bizarre nesting of processes. Using `&' and `>' would send
> some-other-command off into the background so that `cmd2' can finish
> without having to wait for some-other-command to finish.

But the chance for loops and avoiding temp files would still be
missing, as would more crucial mechanisms not yet discussed; see below.
I'm also not aware of uses for "&" to mean backgrounding except at line
end; this may well be just my ignorance; I can't read most shell
scripts I see.

> But toss a few inline awk scripts into that pipeline and it quickly
> becomes un{read,maintain}able.

Hey, this is true of any computer science language you care to name; it
hasn't stopped even sed, awk, and APL from being exceptionally useful.

>>      PIPE    cmda args >-{1,2} +
>>              cmdb args >-{1,2,3} +
>>         -<1  cmdc args >-2 +
>>         -<2  cmdd args >-4 +
>>         -<3  cmde args >-1 +
>>         -<4  cmdf args > result
>>
>>This isn't much worse (to my eye!) in appearance than the other,

> Well.. I guess there's just no pleasing some people. If I'm reading
> this right you want the input of cmdd to be some combination of the
> outputs of cmda, cmdb and cmdc. If each of those are executing
> concurrently how do you avoid mixing the outputs? If you make sure the
> outputs aren't mixed, how do you specify the order that cmdd sees.
> Would it be in the order specified in the command, random, or what?

More exciting and pertinent was the feed from cmde to cmdc, at least a
nontrivial task in the current shells. I was more amazed that no one,
in the context of AmigaOS, challenged my coopting of "{}" as
metasymbols with the "obvious" meaning.
The supposition that Amiga users are not Unix shell users in drag
stands disproven, at least for that portion inhabiting USENet.

You didn't have my advantage of sitting through Pete Goodeve's
presentation of his pipe joiner at BADGE. The trick to keep things as
straight as they need to be is that "things" are passed from pipe
traffic producer to pipe traffic consumer in packets using AmigaOS's
message and reply system. A simple flush() after each meaningful unit
by the producer, combined with a sufficiently large pipe packet size,
will serve to synchronize the pipe traffic into meaningful chunks for
the consumer.

>ohwell.. so long as unix-ese-speaking shell's are available then
>I'll be happy ;-).. (And, no, I didn't "grow up" on Unix .. I've
>used many many other kinds of systems .. I find the notation in
>Unix shell script-ese (Bourne or Korn shell.. I *emphatically*
>don't write programs with csh) to be very convenient & powerful.)

Speaking of power ... ;-). I hadn't yet thrown my real spanner into the
works, hoping to let this cud be first digested by the onlookers.

The trouble with pipes, even if they contain fan-out and fan-in
functionality just outside the bounds of the individual process with
the acceptance for further effort of the above work, is that the
present notation only usefully supports processes that are, at least
with respect to piped data, "filters": single input, single output.

Filters are not the whole world, and, though tagged packetized data is
a way to make a non-filter in concept act like a filter in reality with
the addition of a smart fanout with respect to the tags, it would be
nice to support the more interesting programs with m-way-data-in,
n-way-control-in, o-way-data-out, p-way-control-out,
1-way-messages-out, and 1-way-log-out paths, a more realistic piece of
the processing universe.
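
As a rough illustration of the "smart fanout with respect to the tags" idea, here is a minimal Python sketch. It is only a model of the routing decision: it does not use AmigaOS message ports, and the names `fan_out`, `routes`, and the tag and consumer labels are hypothetical, invented for this example. Each packet carries a tag, and the fanout hands the payload to every consumer registered for that tag, preserving arrival order.

```python
from collections import defaultdict

def fan_out(packets, routes):
    """Deliver each (tag, payload) packet to every consumer listed for its tag.

    packets: iterable of (tag, payload) pairs, in arrival order.
    routes:  dict mapping a tag to the list of consumer names that want it.
    Returns a dict mapping consumer name to the payloads it received.
    """
    delivered = defaultdict(list)
    for tag, payload in packets:
        # Packets whose tag has no registered consumer are simply dropped.
        for consumer in routes.get(tag, ()):
            delivered[consumer].append(payload)
    return dict(delivered)

# Hypothetical traffic: one stream mixing data, control, and log packets.
packets = [
    ("data", "line 1"),
    ("control", "pause"),
    ("data", "line 2"),
    ("log", "started"),
]
# Hypothetical routing table: "data" is teed to two consumers.
routes = {
    "data": ["editor", "archiver"],
    "control": ["editor"],
    "log": ["console"],
}
result = fan_out(packets, routes)
```

With a table like this, a single mixed stream reaches an otherwise filter-like consumer as if it had separate data and control inputs, which is the workaround the paragraph above describes; the multi-port notation proposed next would make the separate paths explicit instead.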

In order to do this in a visual plumbing GUI, you have to have room for
lots of "quick disconnect fittings" per process, with labels and tags
and "keying" to make sure the right pipe hits the right fitting and
lots of other complexities that make the problem pretty intractable.

It is much less complex to design this functionality into a script
shell; you just have to provide a tag corresponding to the file handle
for each pipe fitting, the stdin=0, stdout=1, stderr=2 of Unix filter
processes being the paradigmatic example.

I won't claim to have any "non-ugly" ways, or ways that would be fun to
type raw as opposed to editing into a script, to do this, but here is
one possible example of how it could be done. I _will_ contend that
making this possible from a scripting language would be nearly
infinitely useful (if for no other reason than that it provides a model
for the GUI solution), to allow multitasking support for high speed,
low data storage overhead ways to interconnect and tightly couple
processes by independent authors.

Suppose we have three processes; the first eats a stream of text and a
stream of editing commands, and emits the edited text and forward and
reverse diff files. The second samples the system clock, times the edit
commands and selectively feeds back new commands which vary depending
on the time versus space efficiency of the current system load, and
writes a log of its work, and a file of the time tagged load averages,
to disk. The third does a consistency check of the diff files, feeds a
control to the first when problems occur, and appends a join of its log
and the second process's log to the input text file. All write any
needed error messages to the system console. All run from boot up to
shut down. (This is all nonsense created just to have an example, of
course.)

We will start with Pete's revision of my original proposal; why fight
over trivia?
However, despite his noting the ease of parsing if the inpipes are to
the left, I have returned to the easier to read AmigaOS and Unix form
with the command name first, after seeing how intensely ugly it looked
the other way with this proposal's increased complexity burying the
commands in mid line.

Next is a summary of the three processes, providing file handle (fh)
assignments, followed by a script entry to spawn and interconnect them.
In hopes of making it all viewable on a single screen, vertical
whitespace has been omitted, so you might want to single line step your
newsreader through the next bit. It exactly fits in 24 lines with
"process1:" at the top of the screen. I've made some arbitrary but
reasonable guesses at file handles for clock and type.

process1: reads text from fh0, editing commands from fh1, and diff
control messages from fh2; it writes edited text to fh3, forward diff
to fh4, reverse diff to fh5, copies its input commands to fh6, and logs
errors to fh7.
process2: reads the clock tics from fh0, reads command copies also from
fh0 to get the best available times from interleaved messages, monitors
process1's error log at fh1, runs at a higher priority than the other
two processes, writes edit commands to fh2, writes a log to fh3, writes
averages to fh4, and logs errors to fh5.
process3: reads the forward diff from fh0, the backward diff from fh1,
and the process2 log from fh2; it writes added input for process1 to
fh3, a joined log to fh4 and logs error messages to fh5.
PIPE +
clock -tic 20 >-6.0 +
type -<<1.0 >-3.1 input_text +
type -<<2.0 >-4.1 edit_commands +
process1 -<3.0 -<4.1 -<5.2 >-.3 edited_text >-8.4 >-9.5 >-6.6 >-{7,11}.7 +
changetaskpri 5 +
process2 -<6.0 -<7.1 >-2.2 >-{9,12}.3 >-.4 averages >-11.5 +
changetaskpri 0 +
process3 -<8.0 -<9.1 -<10.2 >-1.3 >-.4 p3log >-11.5 +
type -<11.0 >-.1 * +
type -<12.0 >-.1 p2log &

I _said_ it was ugly!
;-)

If it isn't obvious, the m.n format is "pipename.filehandle", where the
pipename belongs to the script and not really to the receiving process,
but is named at the receiving process command line, and the pipes are
(in this case, for tutorial purposes, not out of necessity), numbered
sequentially vertically down the screen; while the file handles belong
to the process on the same line. Sorry about having the clock output to
file handle zero, though. ;-)

So, for example, where process1 writes to the tee ">-{7,11}.7", the
first half of the tee is being read by process2 at "-<7.1"; the
pipename is "7", process1 is also using its filehandle fh7 (".7")
(purely coincidence) to write the data, and process2 is using its
filehandle fh1 (".1") to read the data.

Note that the next to the last "type" is synchronizing three inputs of
the "stderr" outputs of process[123] to the console. Notice the
omission of the pipename when input is from or output is to a file, but
that the filehandle is still needed, thus several ">-.1 filename"s, for
example.

[I probably buggered the semantics of changetaskpri in there; do the
obvious right thing.]

I will freely confess that 1) this was an easy one, a hard one would
get messy and require lots of parser working storage to process, 2) it
took me over twenty minutes to type this easy one, and 3) I would never
think of attempting to type this as a command, but would always make it
a shell script, and then probably have to debug that.

Nevertheless, it provides the kind of interconnection of processes
lacking in Unix and its lookalikes at the command interface level.
Lacking such facilities, processes like the above have to be designed
and coded as monolithic projects, rather than loose coded and plugged
together at the command interface. The seeming ugliness here would
remove a much more intense process fork and file handle maintenance
ugliness currently rife in the C code of large Unix programming suites.
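
For what it is worth, the "pipename.filehandle" tokens described above are mechanical to pick apart. The Python sketch below is only a guess at a grammar that covers the tokens appearing in this post (">-" for output, "-<" and "-<<" for input, an optional pipename or brace-enclosed tee list, a dot, then the file handle); the function name `parse_redirect` and the dictionary layout are hypothetical, and the exact meaning of "-<<" is not defined in the post, so the marker is kept verbatim instead of being interpreted.

```python
import re

# Assumed token shape: marker, optional pipename(s), ".", file handle.
# The pipename part is empty for file redirection, a number for a single
# pipe, or "{a,b,...}" for a tee to several pipes.
TOKEN = re.compile(r"^(>-|-<<|-<)(\{[\d,]+\}|\d*)\.(\d+)$")

def parse_redirect(token):
    """Return a dict describing one redirection token, or None if the
    word is not a redirection (a command name, argument, or filename)."""
    match = TOKEN.match(token)
    if match is None:
        return None
    marker, pipes, fh = match.groups()
    # Strip the tee braces, then split on commas; an omitted pipename
    # yields an empty list, meaning "a filename follows".
    names = [int(p) for p in pipes.strip("{}").split(",") if p]
    return {
        "marker": marker,
        "direction": "out" if marker == ">-" else "in",
        "pipes": names,
        "fh": int(fh),
    }

# The process1 line from the script, split into words and classified.
line = "process1 -<3.0 -<4.1 -<5.2 >-.3 edited_text >-8.4 >-9.5 >-6.6 >-{7,11}.7"
parsed = [(word, parse_redirect(word)) for word in line.split()]
```

A real shell would still have to create the pipes, match writers to readers by pipename, and attach each end to the numbered file handle before starting the process; this sketch covers only the lexical step.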

No, I can't write the code to make this work. Daydreaming about it
passes the time.

Kent, the man from xanth.