Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!mit-eddie!uw-beaver!uw-june!pardo
From: pardo@june.cs.washington.edu (David Keppel)
Newsgroups: comp.lang.c
Subject: C++ for science -- DAIMS overview
Keywords: C++ science programming packages
Message-ID: <5596@june.cs.washington.edu>
Date: 30 Aug 88 18:30:09 GMT
Reply-To: pardo@cs.washington.edu (David Keppel)
Organization: U of Washington, Computer Science, Seattle
Lines: 187

>[ So why don't you Scientists all use C++? ]

As an enthusiastic bystander, I promised a summary of the Oceanography
project. Almost as an afterthought, I asked Bruce Eckel for a summary
of the project. I got back a summary from Bruce and a second summary
from Tom Keffer. Thanks, guys!

I've gotten a zillion e-mail requests, so I'm posting this to
comp.lang.c and comp.lang.c++, since it seems to be of pretty general
interest. Here goes:

-----------------------------------------------------------------------

[ Bruce: ]

The DAIMS acronym stands for Data Analysis and Interactive Management
System. We are developing tools to manipulate and analyze large data
sets (e.g., ocean measurements, satellite data). The tools will
include:

1) A Data-Storage Standard
2) Graphics for easy display of data, mapping, etc.
3) A simple interpretive programming language to allow easy access for
   non-programmer types. The language will be extensible without too
   much trouble for programmer types.
4) As many data-related classes as we can create (e.g., matrices,
   vectors, oceans, etc.)
5) Interfaces to existing Fortran libraries
6) A model of the ocean programmed with C++

We won't accomplish all this; thus we are trying to at least establish
a framework where we can build the thing.

We chose C++ because it is supposed to generate maintainable and
extensible code, and make programmers more efficient. The latter is
only true, it seems, if the programmer already understands the
language and/or OO programming.
The learning curve has slowed us down considerably. This is more of an
experiment and an example. We can't really hope to convert all the
scientific Fortran programmers out there to C++. Their productivity
would immediately go to zero for the amount of time it takes to learn
the language. (Using someone else's classes, however, is remarkably
easy -- this might be the saving grace.)

At present, our code is designed to be freely distributed, assuming
the university lawyers don't complain. My understanding is that this
project is intended to create PD code. You can download stuff from
sperm.ocean.washington.edu via anonymous ftp; it is by no means a
finished package, but there are numerous useful C++ examples and
classes.

[ Note: get and read the READ_ME. For the other files, be sure to
  put ftp in `binary' mode, as the files are compressed ]

We would *love* to have contributions for the project; we are actively
seeking other groups/individuals to contribute code. Since there are
really only two of us working on it (and we could probably keep six or
eight busy on a project this size), any contributions are extremely
welcome.

------------------------------------------------------------------------

[ Tom: ]

That seems like a reasonable summary. Another thing worth adding is
that the framework/architecture we develop should be useful for a
variety of sciences, not just oceanography. For example, we have
matrix classes that know how to invert themselves --- that's something
useful in a zillion different fields.

Pardo asks what packages we are building. That would most easily be
answered in person. There are a zillion useful tasks I can think of
(example: a C++ LINPACK interface; another example: a high-level C++
graphics interface to InterViews / X Windows to do things such as draw
axes, arbitrary projections & views of data, etc.
)

Finally, here is an abstract I prepared for the INO summer colloquium:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

                        A B S T R A C T

The Data Analysis and Interactive Modeling System (DAIMS) is a project
led by Thomas Keffer (Univ. of Washington) and Dale Haidvogel
(Chesapeake Bay Institute) under the sponsorship of the Institute for
Naval Oceanography to develop a system of software with five main
goals:

* to act as the user interface for an operational ocean forecasting
  system;
* to serve as a productivity booster for the development of new
  models;
* to verify the skill level of these models;
* to assimilate, archive, and statistically analyze real-time data to
  be used with these models; and
* to act as an educational tool for exploring models and large data
  sets.

Our approach has been twofold. First, to develop a high-level
interpreter to manipulate complex data structures and perform a suite
of analysis techniques on them. Second, to develop two highly
interactive ocean general circulation models, a quasi-geostrophic
model and a primitive-equation model, that can be easily modified and
reconfigured, even at run-time. The general goal is to develop models
that are robust and easy to reconfigure without introducing errors.
Operational constraints are system portability, efficiency,
extensibility, and adherence to standards.

It is our intention to keep the results of DAIMS in the public domain,
distributing the software through the Internet. This requires using a
minimum of proprietary software, eliminating licensing restrictions.

There have been two meetings of the DAIMS working group. At these
meetings it was decided to adopt an object-oriented architecture for
the models and interpreter. This is the only practical way of managing
a large project with a minimum of manpower, while ensuring the goals
of the project.
It is the goal of DAIMS to define this object-oriented architecture in
such a manner as to make the construction of models and analysis
software easy and dependable.

The DAIMS project has chosen C++ as its system programming language.
This is an object-oriented language developed at AT&T that runs on a
wide variety of UNIX and MS-DOS machines. It is a superset of the
popular "C" programming language. Like other object-oriented
languages, it offers inheritance of objects, encapsulation of
information, and polymorphism (the functionality of a function call
depending on object type). Other advantages are a systematic way of
organizing large programs, strong type-checking to reduce errors,
run-time efficiency, and easy interfacing with old C and FORTRAN code.
The intention is to develop the high-level abstract views of the model
using C++, but to call existing FORTRAN routines to do the intricate
numerics (e.g., elliptic equation solvers).

We now have an interpreter up and running. At this time, it is capable
of only a few operations on a few simple objects. We will be extending
its repertoire of objects and methods in the future.

While our eventual goal is to write a highly interactive general
circulation model, we have chosen to experiment with a simple
one-dimensional ocean spinup demonstration model that can be run
entirely on a workstation. The intention was to explore the
object-oriented architecture and the user interface in an
easy-to-manage environment. The model is a spectral model (using
Chebyshev polynomials) that solves for the depth of the thermocline in
zonal cross-section. The type and shape of forcing, time-stepping,
boundary conditions, friction, and other parameters can all be
selected at run-time, using a mouse. It is designed to run on a Sun-3
workstation. It makes a useful teaching aid for a course on planetary
waves and general circulation. The model is available via anonymous
ftp from the host sperm.ocean.washington.edu (128.208.2.7).
A technical report on the model is available directly from T. Keffer
(School of Oceanography; WB-10; Univ. of Washington; Seattle, WA
98195).

Experience with this model has helped define the emerging architecture
of the system. Different physical domains within the system --- ocean
turbulent boundary layers, atmospheres, basins, etc. --- are
incorporated as different "objects" that can be created, manipulated,
and displayed. In turn, these objects incorporate other objects that
are more tuned to the numerics of the project --- grids of
temperature, u velocity, etc. All objects include "state" variables
(e.g., temperature) and independent parameters (e.g., diffusivity).
Because of the object-oriented approach, each object is responsible
for its own privately controlled data. Another object learns about it
only by interrogating it in a systematic way. This approach allows
easy changes to the model, minimizing "ripple-effects". For example,
an active turbulence closure model can easily be substituted for
parameterized boundary conditions.

------------------------------------------------------------------------

Bruce and Tom haven't given me permission to give out their names and
e-mail addresses, but I will anyway, so please be reasonable in what
you send them (reasonable enough that *I* don't get in trouble,
anyway :-). Personally, I think this project is a really great thing
and I'd like to see a lot of people get involved.

Bruce Eckel: eckel@sperm.ocean.washington.edu
Tom Keffer:  keffer@sperm.ocean.washington.edu

    ;-D on  ( Strange, I had that keyboard here just a minute ago... )  Pardo
--
pardo@cs.washington.edu
{rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo