Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!ames!purdue!haven!vrdxhq!daitc!daitc.daitc.mil From: jkrueger@daitc.daitc.mil (Jonathan Krueger) Newsgroups: comp.databases Subject: Re: Dumb Question (Outer Join) Message-ID: <538@daitc.daitc.mil> Date: 9 Jun 89 16:37:33 GMT References: <423@cimshop.UUCP> Sender: jkrueger@daitc.daitc.mil Reply-To: jkrueger@daitc.daitc.mil (Jonathan Krueger) Organization: DTIC Special Projects Office (DTIC-SPO), Alexandria VA Lines: 345 In article <423@cimshop.UUCP>, davidm@cimshop (David Masterson) writes: >I know I'm going to kick myself when I find the answer to this, but... No, you mustn't, you mustn't!! Your mother and I thought you were over your autism! :-) >What is the method for doing an outer join in SQL? For instance, given: There is none. NOW will you STOP KICKING YOURSELF??? OK, for better exposition, I'll clip the relevant RTI technical note. -------------------------cut here----------------------------------- INGRES Technical Note #79 ========================= Outer Joins ----------- INGRES Version : all Operating System : VMS Date Written : 26-May-1987 Date Modified : What is an Outer Join? ---------------------- When a relational join is performed, data from either table can be obtained only if the joining field(s) contains matching data in both tables. There are circumstances, however, when one, or both, of the columns to be joined has values that the other column does not have. In such situations, it is not unusual to require that a query give a listing of both matched and non-matched values, along with other data values from those rows where possible. This is an "outer join". As an example, suppose a PERSONNEL table lists all employees. An APPRAISAL table lists the names of employees who have been evaluated, with their ratings. You want a list of all employees, with ratings for those evaluated, and a "0" rating for those not yet evaluated. For this outer join, we can refer to the PERSONNEL table as the "outer" table, and the APPRAISAL table as the "inner" table. This paper describes two ways to accomplish the outer join. The first method involves the creation of temporary tables to hold intermediate results and to store the final outer join. The second method involves the use of QUEL's aggregate functions, without temporary tables. Sample Data ----------- The data for the following discussion are the two tables: PERSONNEL APPRAISAL ------------ ------------------- | name | |name |rating| ------------ ------------------- | Adams | |Adams |7.5 | | Brown | |Dixon |9.2 | | Campbell | ------------------- | Dixon | | Estes | ------------ Technical Note #79 Outer Joins Page 2 Method 1 - Using Temporary Tables --------------------------------- Below is a description of how to do an outer join in four steps. The first column describes the step, the second column shows the results of that step, the third column shows the QUEL code that performs that step, and the fourth column shows the SQL code that performs that step. Description Result QUEL SQL =========== ====== ==== === Make a temporary TEMP range of p create table table that has the ------- is personnel temp as unique values of Adams retrieve into select name of the join field Brown temp from personnel p from the outer Campbell (p.name) table. Dixon Estes Perform the join FINAL range of a create table of the outer and ------- is appraisal final as select inner tables; put Adams 7.5 retrieve into p.name,a.rating the result in a Dixon 9.2 final (p.name, from appraisal a, table. a.rating) personnel p where p.name= where p.name= a.name a.name Delete from the TEMP range of t delete from temporary table ------- is temp temp t the rows that have Brown delete t where at least one Campbell where t.name= t.name= matching row in Estes a.name any(select the inner name from table. The appraisal) temporary table now contains the values in the outer table that have no match in the inner table. Append the FINAL append to insert into temporary table ----- final final to the join result Adams 7.5 (t.name, select name table. Use a Dixon 9.2 rating=0) rating=0 code or value for Brown 0.0 from temp; the column that has Campbell 0.0 no value. Estes 0.0 Let's modify our sample data slightly to show an extension of the outer join example above. Suppose the PERSONNEL table is updated daily. It always contains all of the currently active employees. The APPRAISAL table, however, is updated monthly and may contain the names of employ- ees no longer with the company. These names would not be in the PERSON- NEL table. In the method above, the outer join would contain non-match Technical Note #79 Outer Joins Page 3 names from PERSONNEL, but not the non-match names from APPRAISAL. How could we obtain the non-match names from APPRAISAL, as well as those from PERSONNEL? The four steps below describe how to do it. We've added one row to APPRAISAL to show this. The data are: name = Farrell, rating = 8.3 . Description Result QUEL SQL =========== ====== ==== === Create a temporary TEMP range of p create table temp table containing ---- is personnel (name c20, all rows from both Adams 0.0 range of a rating f4); tables. Brown 0.0 is appraisal insert into temp Campbell 0.0 create temp select name, Dixon 0.0 (name=c20, rating=0 from Estes 0.0 rating=f4) personnel; Adams 7.5 append to insert into temp Dixon 9.2 temp select name, Farrell 8.3 (name=p.name, rating from rating=0) appraisal; append to temp (name=a.name, rating= a.rating) Create a second FINAL retrieve into create table final table that holds ------- final (p.name as select name, the rows having Adams 7.5 a.rating) rating from the joining field Dixon 9.2 where p.name= personnel p, values in both a.name appraisal a tables. where p.name= a.name; Delete the matches TEMP range of t delete from temp from the first ------- is temp where temp.name= temporary table. Brown 0.0 range of f final.name; Campbell 0.0 is final Estes 0.0 delete t Farrell 8.3 where t.name= f.name Append the FINAL append to insert into final remaining rows ------- final select * from in the first Adams 7.5 (v.all) temp; temporary table Dixon 9.2 to the second Brown 0.0 temporary table. Campbell 0.0 Estes 0.0 Farrell 8.3 Technical Note #79 Outer Joins Page 4 Method 2 - Using Aggregate Functions ------------------------------------ The QUEL language (unlike SQL) offers an alternative method for perform- ing an outer join. This method relies on syntax rather than temporary tables. With the same data that we used in the previous method, we can do a sim- ple outer join that counts the number of matching rows in the inner table, and includes a count of 0 for values in the outer table that have no match in the inner table. Here is the retrieval and the resulting table: range of p is personnel range of a is appraisal retrieve (p.name,n=count(a.name by p.name where a.name=p.name)) --------------------- |name |n | |-----------|-------| |Adams |1 | |Brown |0 | |Campbell |0 | |Dixon |1 | |Estes |0 | --------------------- If we want to obtain the actual rows, and not just count them, then we have to use the aggregate function "any". If a retrieved row passes the qualification, then "any" returns a value of "1". A "0" is returned if the qualification is not passed. If we put a join in the "any" function, we are, in effect, testing whether a value is shared between two tables. By requiring that the function return a "0", we are asking for those values not shared by the two tables. If we combine that qualification with the usual join syntax, we obtain the rows that do have a shared value, as well as those that do not - an outer join. There is one extra step required to use this method. A dummy row must be added to the inner table to provide values for the data when no joining values are found. These can be codes or standard values that you have decided represent "missing data". Here is the QUEL syntax for performing the outer join: Technical Note #79 Outer Joins Page 5 range of p is personnel range of a is appraisal append to appraisal (name=" ",rating=0.0) retrieve (p.name,a.rating) where (p.name=a.name) or ((any(p.name by p.name where p.name=a.name) = 0) and a.name= " ") The result of this retrieve is the following table: +----------------------+ |name |rating | |-------------|--------| |Adams |7.5 | |Brown |0.0 | |Campbell |0.0 | |Dixon |9.2 | |Estes |0.0 | +----------------------+ First, the retrieve does a standard join by the "where (p.name=a.name)" syntax. Then, rows in PERSONNEL that do not have a match in APPRAISAL are included because of the "any...= 0" syntax. When they are included, they are joined to the dummy row in APPRAISAL by the "and a.name..." syntax so the dummy values can be included for the data fields. Summary ------- An outer join is a join of two tables in a database in which the values of the joining field(s) do not always occur in both tables. Outer joins can be performed in INGRES using QUEL or SQL code to build temporary tables that will hold intermediate data and the final outer join. QUEL is capable of performing an outer join without temporary tables, using the aggregate function "any" . Technical Note #79 -------------------------cut here----------------------------------- Hope this helps. -- Jon --