Path: utzoo!attcan!uunet!lll-winken!lll-tis!daitc!jkrueger From: jkrueger@daitc.daitc.mil (Jonathan Krueger) Newsgroups: comp.databases Subject: Re: INGRES/EQC and SQL/C ...and general notes... Summary: convert or die Message-ID: <166@daitc.daitc.mil> Date: 27 Aug 88 19:09:33 GMT References: <24546@bu-cs.BU.EDU> <2401@rtech.rtech.com> <24585@bu-cs.BU.EDU> Reply-To: jkrueger@daitc.daitc.mil.UUCP (Jonathan Krueger) Organization: Defense Applied Information Technology Center, Alexandria VA Lines: 107 In article <24585@bu-cs.BU.EDU> berlin@buita.bu.edu (David Fickes, Einstein Project) writes (some material trimmed): >Yes a simpler QUEL statement will do it .. BUT ONLY IF ALL OF THE RESULTS >ARE A CORRECT DATE... take the string "00/00-02/89?" and put it through our two >statements as one of 40,000 records ... >WHY CAN'T WE NEST THE DARN QUEL STATEMENTS? You're confusing three issues: [1] Why can't I tell INGRES to relax atomicity of single statement transactions? [2] Why can't type conversion functions return an error value flagging problem arguments, rather than or as well as aborting the transaction? [3] Why can't EQUEL provide more convenient iteration constructs, in particular beyond those offered by QUEL? This would allow the programmer to work around problems such as the above, and would also be useful for their own sake. Issues [1] and [2] have been discussed before in this forum and elsewhere. I can't think of any way for vendors to allow [1] without also allowing you to subvert the relational model. Then when our programs did strange and difficult to predict things, Ed and Chris would say "We told you so" and there would be great wailing and gnashing of teeth throughout the land. But I think [2] could be implemented cleanly. However, Major thesis: we don't want special values in each data type to indicate illegal, similar to IEEE "not a number" values for floating point types. The reason is that they can't avoid the extra complexity and we can. If a column is of type date, we want to know that all values are legal dates. Neither do we want a generic untyped BAD flag similar to MISSING. Instead of flagging a mistake they would store it in the database, in effect gradually corrupting it. MISSING values, while also adding complexity, at least can be traced back through an audit trail. They aren't derived from functions of other values, they're just set or cleared. Type conversion failures can be high order side effects very difficult to trace. If we find a broken window MISSING allows us to find the fingerprints on the rock, BAD just tells us the time and proximate cause. Minor thesis: type conversion functions can't flag conversion failures, since they can only return successes. Type conversion functions return objects of the target type. By the major thesis, there are no predefined "bad" values for each type. Of course, the functions could be redesigned to also return status codes, but the assignment syntax for multiple return values would require great changes to the language, might not even permit upward compatibility of current programs. Even if this could be done, you're still calling the function in different way, the status code becomes an argument in the where clause instead of the object being a source value in the target list. Therefore: to handle type conversion failure cleanly, you would have to perform separate tests using different functions (or the same functions called in different ways). If you don't perform the tests to avoid "falling into" the type conversion failure, we still abort the entire transaction. So for bulletproof applications, you write extra code. E.g., for an interactive application: /* find the guys that won't convert */ retrieve (d.doc_date) where d.date=" " and squeeze(d.doc_date)="00/00/??" and isdate("01/01/"+right(squeeze(d.doc_date),2)) = 0 For an automated application: /* don't do possible updates without flagging impossible ones */ begin transaction /* do the possible */ replace d (date = date("01/01/"+right(squeeze(d.doc_date),2))) where d.date=" " and squeeze(d.doc_date)="00/00/??"; and isdate("01/01/"+right(squeeze(d.doc_date),2)) = 1 /* flag the impossible */ retrieve into problem_d (d.all) where d.date=" " and squeeze(d.doc_date)="00/00/??" and isdate("01/01/"+right(squeeze(d.doc_date),2)) = 0 end transaction Now, if we agree that relaxing atomicity [1] is a bad idea, then providing better iteration constructs [3] is neither necessary nor sufficient to answer David's original question. He still needs some way of deciding what to do inside his nested loop when an individual replace fails. He has to be able to distinguish type conversion failures from any other update failure, e.g. violating an integrity constraint on d.date. But solving problem [2] is necessary and sufficient, as shown above. Therefore, while [3] would be great for other reasons, it's not appropriate here. Finally, looking toward the day that commercial dbms support user defined abstract data types, their type definition facility must provide a way to define conversion functions and also functions to check for conversion failure. E.g. if I define a domain "color" drawing from {red,orange,yellow,gree,blue,indigo,violet}, ordered low to high as shown left to right, I should be able to query: /* print the observations by color where it's known */ retrieve (colorful_obs = color(right(squeeze(d.string),2))) where iscolor(right(squeeze(d.string),2)) = 1 sort by colorful_obs -- Jon -- Jonathan Krueger uunet!daitc!jkrueger jkrueger@daitc.arpa (703) 998-4777 Inspected by: No. 15