INF: Using the "*=" and "=*" (Outer Join) Operators

ID: Q67753


The information in this article applies to:


SUMMARY

An outer join is defined as:

A join that includes all rows from the outer table, regardless of whether there is a matching row in the inner table. If there is no matching row in the inner table, or if the join condition is not met, the second table generates a "NULL" result as a match for that row.

The outer join operators are "*=" and "=*". The "*=" operator indicates that the first table is the outer table, and the second table is the inner table. The "=*" operator indicates the reverse: the first table is the inner table, and the second table is the outer table. In any join, restrictions other than the join condition can be placed on the outer table, but no restrictions beyond the join condition can be placed on the inner table.


MORE INFORMATION

Example

Consider the following operations on these two simple tables:

          Table1                       Table2
   ----------------------      --------------------------
   | Col1   |    Col2   |      |    Col3 |  Col4        |
   ----------------------      --------------------------
      1     |     1                  1   |    11
      2     |     2                  3   |    13
      3     |     3
      4     |     4 

A simple outer join that tests from equality between Table1.Col1 and Table2.Col3, designating Table1 as the outer table, and Table2 as the inner table, would look like the following:

   SELECT * FROM TABLE1, TABLE2
    WHERE COL1 *= COL3 

The result set from this operation will be:

   Col1       Col2     Col3     Col4
   ---------------------------------
    1     |     1   |   1    |   11
    2     |     2   |   NULL |   NULL
    3     |     3   |   3    |   13
    4     |     4   |   NULL |   NULL 

Note that rows 2 and 4 from Table1 are "joined" with NULL, due to the fact that there where no matching rows in Table2.

Reversing the direction of the join produces a different result set, and no rows joined with "NULL":

   SELECT * FROM TABLE1, TABLE1
    WHERE COL1 =* COL3 

The following result set is produced:

   Col1       Col2     Col3     Col4
   ---------------------------------
    1     |     1   |   1    |   11
    3     |     3   |   3    |   13 

The behavior of outer joins is fairly straightforward, as long as no restrictions beyond the outer join restriction is placed on the inner table. When restrictions are placed on the inner table of an outer join, the results may appear unpredictable. Actually, they are not unpredictable (within the context of a single implementation of SQL), and conform exactly to the rules governing outer joins. Consider the first example, with an additional restriction:

   SELECT * FROM TABLE1, TABLE2
    WHERE COL1 *= COL3
    AND COL4 = NULL 

Considering that the result set without the additional restriction is as follows

   Col1       Col2     Col3     Col4
   ---------------------------------
    1     |     1   |   1    |   11
    2     |     2   |   NULL |   NULL
    3     |     3   |   3    |   13
    4     |     4   |   NULL |   NULL 

it seems logical that the restricted result set would be the following

   Col1       Col2     Col3     Col4
   ---------------------------------
    2     |     2   |   NULL |   NULL
    4     |     4   |   NULL |   NULL 

(and it is, in some implementations of SQL; see "Further Explanation," below).

However, the definition of an outer join specifies that ALL rows from the outer table will appear in the result set, so the result set is actually:

   Col1       Col2     Col3     Col4
   ---------------------------------
    1     |     1   |   1    |   NULL
    2     |     2   |   NULL |   NULL
    3     |     3   |   3    |   NULL
    4     |     4   |   NULL |   NULL 

Further Explanation

Using additional restrictions on the inner table of an outer join is considered "undefined" behavior. All SQL engines will handle this syntax in a slightly different manner, and the behavior is only predictable with respect to a particular implementation. In the case of Microsoft SQL Server, the additional restrictions are actually incorporated into the base select statement generating the result set. Therefore, if you impose an additional restriction that creates a condition that is met, it will be displayed in the result set. For example, if you change the last example to test for the value "13" in column 4, rather than NULL, the result set will be:

   Col1       Col2     Col3     Col4
   ---------------------------------
    1     |     1   |   1    |   NULL
    2     |     2   |   NULL |   NULL
    3     |     3   |   3    |    13
    4     |     4   |   NULL |   NULL 

The other popular implementation is to take a "temp result table" from the join condition, and then apply the further restriction to it. This result set violates the definition of an outer join, because not all rows from the outer table will be displayed. However, since this is undefined behavior according to the ANSI standard, both positions can be defended. The result set that you get from the following test query

     SELECT * FROM TABLE1, TABLE2
       WHERE COL1 *= COL3
       AND COL4 = NULL 

is the following "intuitive" result set:

   Col1       Col2     Col3     Col4
   ---------------------------------
    2     |     2   |   NULL |   NULL
    4     |     4   |   NULL |   NULL 


Keywords          : kbusage SSrvServer 
Version           : 4.2
Platform          : OS/2 
Issue type        : 

Last Reviewed: March 10, 1999