Path: utzoo!utgpu!watmath!att!pacbell!rtech!menace!dennism
From: dennism@menace.rtech.COM (Dennis Moore (x2435, 1080-276) INGRES/teamwork)
Newsgroups: comp.databases
Subject: Re: Extended RDB vs OODB
Summary: Where do these facts come from?
Keywords: OODB C++ RDBMS
Message-ID: <3367@rtech.rtech.com>
Date: 15 Aug 89 16:57:35 GMT
References: <3560052@wdl1.UUCP> <411@odi.ODI.COM> <458@cimshop.UUCP> <2177@cadillac.CAD.MCC.COM> <20@dgis.daitc.mil> <2230@cadillac.CAD.MCC.COM>
Sender: news@rtech.rtech.com
Reply-To: dennism@menace.UUCP (Dennis Moore (x2435, 1080-276) INGRES/teamwork)
Organization: Relational Technology, Inc. (Opinions expressed are the writers own)
Lines: 113

In article 3452 of comp.databases, speyer@joy.cad.mcc.com (Bruce Speyer) writes:
>In article <20@dgis.daitc.mil> jkrueger@dgis.daitc.mil (Jonathan Krueger) writes:
>>speyer@joy.cad.mcc.com (Bruce Speyer) writes:
>>
>>>If an application must cross its process boundary in order to
>>>communicate with the database system it probably is at least two orders
>>>of magnitude too slow.  That is why all of the C++ based OODBMS efforts
>>>are using the application memory heap for the cache.
>>
>>Could you provide some performance measurement data that qualify
>>and quantify this assertion?
>>
>>-- Jon
>
>No, I don't have the numbers or the time to work them up.  Perhaps somebody else
>could provide actual statistics and even disprove my assertion.  It would be
>interesting to hear from somebody involved with the HP Iris system which is
>based upon a relational database.
>

It is true that changing contexts takes a small number of milliseconds,
depending primarily on the architecture of the CPU (e.g. an 80x86 takes a
long time because it is a segmented architecture, while a 68x00 takes the
same amount of time for a kernel call as for a non-kernel call).  However,
you must do a context switch to call a C++ library routine or to call a
database routine, so there is not much difference there.  The real
difference in response time in most current DBMS's (OO *OR* R) is that they
are client-server (or multi-server, in the case of INGRES -- caveat: I work
for RTI and INGRES is our product).  This means that to access data you use
IPC (inter-process communication) rather than a function call.  IPC
generally is much slower than a function call, but let's not forget one
*MAJOR* saver here -- the SAME server can serve literally hundreds of users.
If each had its own linked copy of the C++ data access routines, there would
be so much swapping/paging going on on the host that nothing would get done.
Even if shared libraries were used, each user would have her own data
segments etc., and would use many more resources than the DBMS does
currently.  Therefore, I have no issue with the claim that a single-user
system is better off with a highly tuned, memory-hogging, specialized access
method than with an RDBMS.

>About 3 years ago I tried putting an electronic information model on top of a
>relational system.  It took about 30-40 times longer to netlist a circuit then
>it did using a fairly inefficient internally developed memory-based database
>system.  An operation such as packaging the electronics is much worse since it
>must transverse much more of the electronic information model and be constantly
>refering to the library portion of the model which was distributed to another
>database (making the join operation much more expensive).
>

Excuse me, have you heard of distributed database?  INGRES*STAR would allow
you to keep your packaging information in a separate "database," and still
do joins just as if the data were in the same database.  The concept of "a
database" (as opposed to "a different database") basically goes away, as the
user can pick and choose tables from multiple "databases" to be in one STAR
database.
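As a sketch only (the table and column names are made up for illustration,
and I am skipping over the step that registers the remote table into the
STAR database), a join against a library table that physically lives in
another database reads just like a local join:

    SELECT C.COMPONENT#, L.PACKAGE_TYPE
    FROM   COMPONENTS C, LIB_PARTS L   -- LIB_PARTS lives in the "library" database
    WHERE  C.PART# = L.PART#
    AND    C.DIAGRAM# = :diagram_number;

The application just writes the join; the distributed DBMS worries about
which database each table actually lives in.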
Maybe the reason it was slow was that you didn't know what you were doing.
Let me posit a different architecture for your electronic information model.
Could you have read all the data into memory from an RDBMS and performed the
same manipulations in-core that you did in your system?  The advantage of
this architecture is that you can lock the records while you are
manipulating them (with THREE WORDS ("FOR DIRECT UPDATE"), as opposed to
many lines of code), you get all the transaction processing capabilities of
the DBMS (e.g. rollback, savepoints, commit), you get all the utilities of
the DBMS, etc.  To put it in a few words, YOU GET THE *MS* FROM THE DBMS,
and you do your own processing.

>Compare the cost of processing a tuple at a time to a C++ style database. If
>the object is in-memory then optimally an indirect reference and a test is all
>that is required to transverse a relation or access an attribute.
>

What a surprise!  In INGRES, there is a concept called a TABLE FIELD (NOTE --
many other databases (such as Gupta's RESULT SETS, Sybase's SETS, etc.) have
the same concept with other terms).  You select a SET AT A TIME, NOT A TUPLE
AT A TIME, into the TABLE FIELD.  BTW, do you know that a database oriented
to TUPLE AT A TIME processing is not relational?  By definition, a relational
database can process a SET AT A TIME.  For instance, if the diagram tuple has
a surrogate key DIAGRAM#, which appears as a foreign key in the components
table (which I will call COMPONENTS), then you could find all the components
of a diagram with the following SQL statement:

    SELECT * FROM COMPONENTS
    WHERE  DIAGRAM# = :diagram_number;

where diagram_number is a C variable (for instance) containing the number of
the host diagram.  The results of this select could be stored in a table
field and manipulated in core.  BTW, all the table field manipulations (e.g.
INSERTROW, DELETEROW, etc.) are in our language, so you don't have to write
list processing classes -- we already did.
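The same set orientation applies to changes, not just retrievals.  As a
sketch (the PLACED column is made up, and this is plain illustrative SQL
rather than exact INGRES syntax), marking every component of the diagram as
placed is one statement, with no per-tuple loop in the application:

    UPDATE COMPONENTS
    SET    PLACED = 'Y'                -- PLACED is a hypothetical status column
    WHERE  DIAGRAM# = :diagram_number;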
So, in summary, whether you use an OO system or an RDBMS (which has OO
features and capabilities), you can process the data in memory.  You STILL
have to get that data from disk and to disk SOMETIME, and the RDBMS will be
better at that.  In addition, the RDBMS already comes with the in-memory
manipulation features.  The RDBMS also protects against hardware and
software (e.g. the break key) failures, and provides you with the capability
to start off a process and then back out if you don't like the results.  The
RDBMS is optimized to provide consistency and concurrency for the data.

The OO "faction" here keeps talking about what RDBMS's don't do, and yet
every example so far has been doable with an RDBMS today.  I am *SURE* that
there *ARE* things that an OODB can do that an RDBMS can't, but RDBMS's are
developing new features faster and faster (there are more people in
engineering in *MY* company than in their whole company).  I would like to
point out that only two people are doing this rather poor defense of the
entire OODB industry.  After all, if OO were not a good idea, we wouldn't be
developing even more OO features now.

>My apologies for not being able to back up my statements with benchmarks.

'Nuff said ...

>Bruce Speyer / MCC CAD Program                  WORK: [512] 338-3668
>3500 W. Balcones Center Dr., Austin, TX. 78759  ARPA: speyer@mcc.com
>
>

-- Dennis Moore, my own opinions, etc etc etc