Newsgroups: sci.virtual-worlds
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!batcomputer!cornell!uw-beaver!milton!hlab
From: hht@sarnoff.com (Herbert H. Taylor x2733)
Subject: More on VR Architectures [LONG]
Message-ID: <1991Apr5.030424.16993@milton.u.washington.edu>
Sender: hlab@milton.u.washington.edu (Human Int. Technology Lab)
Organization: Human Interface Technology Lab, Univ. of Wash., Seattle.
Date: Thu, 4 Apr 91 21:27:08 EST
Approved: cyberoid@milton.u.washington.edu

Chris Shaw has challenged a number of our assertions about VR processing requirements while holding strong opinions of what VR is or is not, particularly in regard to the use of exclusively CGI based worlds. With our moderator's indulgence we would like to respond to the technical content of those arguments as they pertain to VR.

Ivan Sutherland's extraordinary vision in 1965 of the ultimate computer display and interaction (as reported by Fred Brooks) remains the most succinct summary of what today we call VR:

 1. Display as a window into a virtual world.
 2. Improve image generation until the picture in the window looks real.
 3. Computer maintains world model in real time.
 4. User directly manipulates virtual objects.
 5. Manipulated objects move realistically.
 6. Immersion in virtual world via head-mounted display.
 7. Virtual World also sounds real, feels real.

With the possible exception of the head-mounted display I would expect that these remain the essential elements of VR. It is not so much the physical apparatus ("head-mounted") as the effect of total "immersion" which is critical to VR. If the same "feel" can be achieved by other means then I believe we still have VR. Today, however, the head-mounted display IS clearly the best way to achieve that effect.

The ultimate goal of this research is realism (even "real" realism) within the virtual world. That is not to say that virtual world participants must appear as they "really" are in the VR.
For some of us that could be a bit of a bummer. However, people and objects must be recognizable in some context. Each time I return to the VR my "neighbors" should be able to say - "Hey, there's Herb."

It was my contention that there is nothing in VR which requires that worlds be polygonally based. That is a choice made specifically to meet the above criteria - not fundamental to the experience. I further concluded that other approaches to the world processing function would require fewer floating point operations. It was not my intention to preclude polygon rendering, nor floating point - which is critical to scientific computation, as Alan has pointed out. (In fact, we can turn a GFLOP with 2048 16-bit DSP processors.) However, I certainly hope researchers are looking, and will continue to look, for methods of building and managing virtual worlds other than strictly graphics based ones - that was the real point. This is clearly an important research topic. I don't know about Chris Shaw, but when I look in my headset I would much rather see a real looking (if "imaginary") person on the other side of the virtual world than a SEGMENTROID (TM) - Gouraud shaded or not <8-)

In my original post I had said:

** >Other approaches forego polygon rendering entirely and hence
** >possess little or no floating point.

And Chris Shaw asked:

** Other approaches such as what, I wonder?

In a workshop presentation at SIGGRAPH 89 Scott Fisher outlined future research issues of VR. These included the objective of "combining real and virtual environments". In a paper included in the course notes of that workshop Fisher states that the Ames Virtual Environment Workstation "system provides a multisensory, interactive display environment in which a user can virtually explore a 360-degree synthesized or REMOTELY SENSED environment and can virtually interact with its components."
[Fisher]

I presume, therefore, that telepresence, telerobotics or teleoperation are considered worthy subfields of Virtual Reality? Certainly much of that work is non-polygonal. Again, Fisher:

"The virtual environment display system is currently used to interact with a simulated telerobotic environment. The system operator can call up multiple images of the remote task environment that represent viewpoints from free-flying or telerobot-mounted CAMERA platforms...Switching to telepresence control mode, the operator's wide angle, stereoscopic display is directly linked to the telerobot 3D camera system..."

Similarly, if a system can process multiple simultaneous video streams and construct a real-time video 3D world with which the user can directly interact, does that not qualify as VR? Ultimately we would like to walk through all sorts of complex 3D data sets, perhaps even volume rendered or terrain rendered ones (voxels, not polygons). A number of near real-time demonstrations of such walk-throughs have been made - and systems such as PxPl5 and the Princeton Engine will be likely platforms for further performance gains in this area.

Other examples of non-polygonal approaches would include the famous video based Aspen driving simulation. Does this qualify as a VR? If a system can provide a completely user directed experience of driving in a "virtual" city (in this case modeled on a real one), is that not a virtual reality? Chris Shaw does seem to consider E&S simulators to be VR - is that only because they use graphics? Under the same criteria, are not interactive video systems such as the DVI based Palenque walk-through also forms of VR? This system uses data compressed video scenes on CD which can be accessed and decompressed in real time. If the user changes his or her point of view (albeit using a pointing device) the world processor locates the correct orientation on disk and makes it appear as if the user turned.
The interaction meets Sutherland's criteria #1, 2, 3, and 7. The choice of pointing device was arbitrary - it would not be difficult to configure the system with a head-mounted display (criterion #6) and to track body motion to provide a true sense of walking through the ancient ruins.

Perhaps the most difficult of Sutherland's criteria is the requirement that manipulated objects move realistically under the control of the user (criteria #4 and 5). These constraints are more difficult because they come in addition to the inherent real-time constraints of the world processor. It is my impression that in present VR systems the latency between object movement and visual feedback is the single most significant limitation to interaction and "feel". Presumably, in flight or driving simulators the "user manipulated objects" are the vehicles themselves and their controls.

In outlining what I viewed as the fundamental limitations of a single processor model for a virtual world processor (using the Cray as an example) I began by characterizing the number of elements in future VR displays - under the assumption that this number represented an upper limit on the number of world elements which must be processed. I thought it was obvious that I was using a head-mounted display to establish a near term system of 1Kx1Kx30 frames per second. (Obviously workstation monitors have already had much higher resolution for several years.) The resolution of the head-mounted display (in addition to frame rate and latency) has been identified as A MAJOR LIMIT TO VR. Presumably, VPL and others are very hot after the very latest, highest resolution head-mounted displays - which I merely pointed out will be here soon. LCD technology developed for rear projection HDTV will approach 800 lines per inch. It appears that this can be readily adapted to head-mounted displays. Other technologies loom in various labs around the world.
However, THE POINT I was making was that resolution was not the entire problem - as each head-mounted display ultimately must have a continuously processed VIDEO source. If all you want is to redirect your SGI polygon world output to the head display, fine, no problem. Be happy. However, I hope in the research community we want much more. In particular I hope we want HDTV.

I also argued that networked VR motivated "MPEG-like" data compression algorithms, which in turn may rob cycles from our world processor. Chris Shaw was incredulous about that:

** What do you need MPEG for???

I can only say that we have been approached by a number of groups about the feasibility of networked VR (and other applications) with a heavy emphasis on data compression. Perhaps I'm thinking ahead with MPEG, but there are and will be requirements for "on the fly" data compression, to be sure. Certainly interactive video applications such as those supported by DVI technology depend on data compression.

Also, in my original posting I had speculated on the requirement for high frame rates in VR. The example I chose was motivated by a NASA workshop on High Frame Rate High Resolution video - where scientists were struggling with the problem of monitoring and controlling space based experiments. These experiments must be controlled completely from the ground station - as the astronaut crew will not be able to interact with the testbed. Combined requirements for frame rates up to 1000 fps and resolutions up to 1Kx1K were stated - way beyond the combined state of the art. The conclusion of the workshop was that very high frame rates would be possible only by trading off spatial resolution. Now, while the ground-space interaction is not a candidate for VR because of the latency through the down link, the captured data might very well be incorporated into a VR - in a "post operative" review of the experiment.
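
As a rough check on the figures above, the following Python sketch compares the raw pixel rate of a 1Kx1Kx30 display with that of a 1Kx1K source at 1000 fps, and then asks how small a frame must become to reach 1000 fps within the display's pixel budget. It assumes "1K" means 1024 pixels (the post does not say), and the fixed pixel-rate budget and the function names are illustrative only, not taken from the workshop.

```python
from math import isqrt

K = 1024  # assumption: "1K" is read as 1024 pixels


def pixel_rate(width, height, fps):
    """Pixels per second that a display or sensor must sustain."""
    return width * height * fps


def max_square_side(budget_pixels_per_s, fps):
    """Largest square frame side (in pixels) that fits the budget at fps."""
    return isqrt(budget_pixels_per_s // fps)


# Near term head-mounted display: 1K x 1K at 30 frames per second.
hmd_rate = pixel_rate(K, K, 30)

# Combined workshop requirement: 1K x 1K at 1000 frames per second.
burst_rate = pixel_rate(K, K, 1000)

# Hypothetical trade-off: hold the pixel rate at the display's level
# and give up spatial resolution to reach 1000 fps.
side_at_1000fps = max_square_side(hmd_rate, 1000)

print(f"1Kx1Kx30   : {hmd_rate:,} pixels/s")
print(f"1Kx1Kx1000 : {burst_rate:,} pixels/s ({burst_rate / hmd_rate:.1f}x)")
print(f"1000 fps within the 1Kx1Kx30 budget: about {side_at_1000fps}x{side_at_1000fps}")
```

Under these assumptions the 1000 fps source needs roughly 33 times the pixel rate of the display, which gives a sense of the scale of the spatial resolution trade-off.
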
I had said regarding the need for higher frame rates:

** >For example, to project images from a remotely sensed combustion experiment
** >will require hundreds of frames per second of acquisition - but in a burst
** >of only a few seconds duration. In order to walk through the data set
** >we must have considerable flexibility in the play back frame rate.

To which Chris Shaw replied:

** This is very lovely, and presumably [... my favorite machine ] does a fine
** job.
** BUT. This isn't Virtual Reality (TM). This is [ ... something else ]

Again, I was simply attempting to illustrate where high frame rate might be useful. Unfortunately, the example did not stimulate a creative nerve for Chris. However, one could easily imagine a system with a "continuous" high frame rate source but at reduced resolution. The point is that if you want to integrate real imagery from a variety of sensors into a VR you must have frame rate conversion algorithms. Chris asks:

** Where's the interaction?

For telerobotic applications a remotely located, variable frame rate camera (or multiple cameras) sends a 3D burst back to the VR. This burst can be captured in frame buffers managed by the virtual world. There are two obvious modes of user interaction: continuous teleoperation (at low resolution) or traversal of the captured 3D "volume" of video data.

** Can you change your point of view?

To the extent that the system has been configured to support either of the two modes previously described. If 3D cameras are used then in principle one could change point of view within a captured video volume. Arbitrary point of view. Dare I mention that in the consumer television research community there is interest in "user directed interactive tv", i.e. the home viewer controls the camera point of view, zoom factor, etc. The transmitted video contains an entire video volume into which you navigate.
If all you want to do is stare at the fans in the stands, the "information" is there for you to do so. Look for it at about Superbowl 40.

** Can you change the experiment as it runs?

No. In the specific example of space based experiments, no. However, within the limits of continuous teleoperation, and if the timeframe of the experiment is long compared to human response times, then yes. Alternatively, within a captured volume of data you can change visualization controls (such as opacity) while you walk through the data.

** The situation you describe allows N camera views at pre-programmed
** locations. If you want a new view that your camera(s) didn't get, you
** have to run the experiment again.

Not exactly correct. Again, within the continuous video volume there is sufficient information to construct a new and entirely arbitrary point of view of the experiment.

** There's nothing wrong with this, but it ain't virtual reality, because the
** level of interaction is severely limited.

If level of interactivity defines VR then this system can be VR...

Herb Taylor

References:

"Virtual Environments, Personal Simulation & Telepresence". Scott Sinkinson Fisher, Course Notes for SIGGRAPH Tutorial #29, 1989

[MODERATOR'S NOTE: Without endorsing Herb's points -- which must stand on their own -- I want to thank him for this impressive synthesis of points of view, as well as a nice statement of his own position. -- Bob J.]