Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site ll-xn.ARPA
Path: utzoo!linus!gatech!seismo!ll-xn!glenn
From: glenn@ll-xn.ARPA (Glenn Adams)
Newsgroups: net.unix
Subject: Re: Asynchronous I/O on UNIX?
Message-ID: <241@ll-xn.ARPA>
Date: Wed, 20-Nov-85 16:44:04 EST
Article-I.D.: ll-xn.241
Posted: Wed Nov 20 16:44:04 1985
Date-Received: Thu, 21-Nov-85 22:05:16 EST
References: <248@well.UUCP>
Organization: MIT Lincoln Laboratory
Lines: 107
Keywords: UNIX I/O
Summary: Some ideas on asynchronous I/O.

> As more and more mainframe systems are moving to UNIX, I am
> very interested in finding out how asynch I/O is being implemented
> on these systems.

This was one of the first complaints I had about UNIX after having
used operating systems which imposed fewer constraints on how user
processes performed synchronization on I/O completion.  There are two
things worth contemplating here: first, why there is no asynchronous
I/O mechanism in UNIX, and second, how such a mechanism might be
implemented.

Before getting down to details, it should be pointed out that there
is another, conceptually cleaner method for performing overlapped
I/O: using multiple processes, each of which has no more than one
outstanding (blocking) I/O request.  This form of overlapped I/O
results in a conceptually straightforward implementation but is costly
in terms of efficiency.  This is especially true given the hard
boundaries maintained between process address spaces and the lack of a
shared memory mechanism (outside of SYSV).  In addition, the overhead
from context switching contributes to the overall inefficiency.
Moreover, the current mechanisms in UNIX for interprocess
communication, e.g., pipes, sockets, or files, all result in the
copying of data to and from the kernel address space as it is being
transferred to the destination process.  This introduces more
inefficiencies.
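
The multi-process approach described above might be sketched in C
roughly as follows.  This is only an illustration under assumptions:
spawn_reader() is a hypothetical name, the 512-byte buffer is
arbitrary, and error handling is minimal.  A helper process blocks in
read() on the source descriptor and forwards whatever arrives through
a pipe, leaving the parent free to do other work.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that blocks in read(src) and
 * forwards the data through a pipe.  Returns the read end of that
 * pipe in the parent, or -1 on failure.
 */
int spawn_reader(int src, pid_t *pid_out)
{
    int p[2];
    char buf[512];      /* arbitrary transfer size */
    ssize_t n;
    pid_t pid;

    assert(pid_out != NULL);
    if (pipe(p) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: exactly one outstanding blocking request at a time. */
        close(p[0]);
        while ((n = read(src, buf, sizeof buf)) > 0)
            if (write(p[1], buf, (size_t)n) != n)
                _exit(1);
        _exit(n < 0 ? 1 : 0);
    }
    /* Parent: keep only the read end and carry on with other work. */
    close(p[1]);
    *pid_out = pid;
    return p[0];
}
```

Each block of data is copied from the kernel into the child, back into
the kernel when the child writes to the pipe, and out again when the
parent reads it, which is the copying cost noted above.  The parent
should eventually collect the child with waitpid().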
There is, however, the select() system call in 4.[23]BSD UNIX, which
allows a timed blocking poll of multiple potentially outstanding I/O
activities.  This is many times more efficient than the previous
busy-wait polling method, which used the FIONREAD ioctl() and was
usable only for a limited number of I/O activities, e.g., read().
Given these various mechanisms for performing multiple I/O activities,
most applications have chosen to make do with them rather than address
the more difficult task of implementing a more general kernel-based
asynchronous I/O mechanism.

My efforts originated while implementing an I/O intensive signal
processing application under RSX-11M/S using Whitesmiths C.  My first
job was to throw out the junk Whitesmiths called a standard I/O
library and make it more similar to UNIX V7.  I actually used the
4.2BSD stdio with the addition of most V7 system calls, which were
mapped to RSX Executive calls of one sort or another.  Since, for this
application, I had strong need of the efficiency of asynchronous I/O,
I needed some UNIX-like mechanism for implementing it.

What I ended up with is as follows: prior to performing an I/O
operation, e.g., read(), write(), or ioctl(), an fcntl() call is
performed with a command argument of F_ASYNC and an argument which
points to an Asynchronous Control Block.  This argument structure
contains the address of the asynchronous I/O handler and an optional
argument to be passed to the handler.  The optional argument is used
to communicate application-specific information to the handler about
the subsequent I/O activity.  The handler is invoked upon I/O
completion as follows:

	(*handler)(status, opt-argument);

Thus the status code indicating the success or failure of the I/O
activity is communicated along with the optional argument specified in
the Asynchronous Control Block.
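
To make the calling convention concrete, here is a user-level sketch
of the arm-then-issue sequence.  It is not the RSX implementation and
F_ASYNC is not a standard fcntl() command, so everything here is an
assumption made for illustration: the struct acb layout, the names
arm_async(), async_read(), and on_complete(), the one-shot arming, and
the 64-descriptor table.  The read is performed synchronously, so the
sketch shows only how the status and the optional argument reach the
handler, not real overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical Asynchronous Control Block: handler plus optional arg. */
struct acb {
    void (*handler)(int status, void *arg);
    void *arg;
};

#define MAX_FD 64                   /* arbitrary table size */
static struct acb armed[MAX_FD];    /* at most one armed ACB per fd */

/* Stand-in for fcntl(fd, F_ASYNC, acb): arm the next operation on fd. */
int arm_async(int fd, const struct acb *acb)
{
    if (fd < 0 || fd >= MAX_FD || acb == NULL || acb->handler == NULL)
        return -1;
    armed[fd] = *acb;
    return 0;
}

/*
 * Stand-in for an armed read().  The I/O is done synchronously here;
 * completion is then reported as (*handler)(status, opt-argument).
 */
int async_read(int fd, void *buf, size_t len)
{
    struct acb acb;
    ssize_t n;

    if (fd < 0 || fd >= MAX_FD || armed[fd].handler == NULL)
        return -1;                  /* descriptor was never armed */
    acb = armed[fd];
    armed[fd].handler = NULL;       /* assumption: arming is one-shot */
    n = read(fd, buf, len);
    (*acb.handler)((int)n, acb.arg);
    return 0;
}

/* Example per-request tag passed as the optional argument. */
struct request {
    int id;
    int status;
    int done;
};

/* Example handler: record the completion status in the request tag. */
void on_complete(int status, void *arg)
{
    struct request *r = arg;

    assert(r != NULL);
    r->status = status;
    r->done = 1;
}
```

A caller would fill in a struct acb with on_complete and a pointer to
its own struct request, call arm_async() on the descriptor, and then
issue async_read(); the handler later finds the byte count (or -1) in
status and identifies the request through the tag.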
It may be argued that a cleaner mechanism could be implemented,
especially since this one calls for two stages, i.e., an arming phase
and an execution phase.  However, I felt that it was better to do it
this way than to add another argument to all I/O related system calls,
or even worse, to add yet more system calls.

From the application programmer's perspective, this mechanism is quite
simple to use and builds upon existing system calls.  The semantics of
handler invocation are quite simple and result in a clean interface
with minimal global data communication.  Furthermore, since this
mechanism enables asynchronous notification on a per-descriptor basis,
it is possible to have outstanding I/O on multiple descriptors.
Further still, since an optional argument is specified on a per-request
basis, i.e., the optional argument in the Asynchronous Control Block,
it is possible to have multiple outstanding I/O requests on a single
descriptor and to use this optional argument to identify the request.

For the application for which I implemented this mechanism, it was
necessary to have overlapping I/O on multiple devices and to have
multiple outstanding requests enqueued to a single device.  The latter
was necessary to reduce I/O turnaround latency on devices with very
small data overrun periods, e.g., an unbuffered A/D converter.  I
haven't mentioned a few details here, such as the obvious need to
protect critical sections of code from asynchronous entry.

Now that I have had some success with this particular interface
mechanism for performing asynchronous I/O, I am considering how it
might best be implemented in the 4.[23]BSD environment.  I haven't
scoped the problem enough at this point to be able to state how
difficult this will be.  One problem that I see already is the fact
that different device drivers use different mechanisms to perform
synchronization.  Some use iowait(), and others call sleep() directly.
If a single mechanism were used, e.g., iowait(), then the task would
be much easier.  Those drivers that use iowait() could be easily
converted to enable asynchronous notification, since a hook could be
placed in iowait() to allow the process to continue and then to notify
the user process when the driver calls iodone().  However, the other
drivers would be much more difficult since they don't necessarily
follow this strict protocol, i.e., calling iowait() and then iodone().
The actual notification could come via the psignal() mechanism with a
special signal (SIGIO?) being used to get things going.

I'm not sure when or if I will get an opportunity to try implementing
these ideas in the UNIX kernel; however, I thought it might be
interesting to discuss the ideas that I've had on the subject in case
you or others are interested in actually doing the implementation.

--
Glenn Adams
MIT Lincoln Laboratory

ARPA:  glenn@LL-XN.ARPA
CSNET: glenn%ll-xn.arpa@csnet-relay
UUCP:  ...!seismo!ll-xn!glenn
       ...!ihnp4!houem!ll-xn!glenn