Description and Usage of Joins in Microsoft Query
ID: Q136699
|
The information in this article applies to:
-
Microsoft Query for Windows, version 1.0
-
Microsoft Query for Windows 95, version 2.0
-
Microsoft Query for the Macintosh, version 1.0
-
Microsoft Query 98 Macintosh Edition
SUMMARY
This article discusses using joins in Microsoft Query. A join is a method
you can use with databases to define and illustrate relationships between
two tables. In Microsoft Query, you can create and manipulate various types
of joins. In many cases, Microsoft Query joins the tables in your query for
you. Joins are represented by lines connecting the tables in the Table
pane.
Topics covered in this article include the following:
What Is a Join?
Inner Joins
SQL Statements
Outer Joins
Subtract Joins
Full Outer Joins
Self Joins
Equi-Joins
Natural Joins
Cartesian Products
MORE INFORMATION
What Is a Join?
A join is a connection between two tables where the two tables are merged
according to a field that they have in common, creating a new virtual table
(which can be saved out as a real table). For example, with the following
two tables:
Color_Table:
Join_Field Color_Field
1 Red
2 Blue
3 Green
Pattern_Table:
Join_Field Pattern_Field
2 Striped
3 Checkered
4 Polka-Dot
a simple join resembles the following:
Join_Field Color_Field Pattern_Field
2 Blue Striped
3 Green Checkered
The result table contains only records 2 and 3 in the Join_Field field
because they are the only records that exist in both the Color_Table and
the Pattern_Table.
A practical example of a join is a retailer's list of products and
manufacturers; a Products table and a Vendors table could be joined on a
Product ID field.
The Inner Join
The join in the previous example, called an inner join, is the simplest
type of join. Usually, you want to use only a fraction of the fields in the
tables. For example, in a real inner join, such as the one above, you may
want to exclude Join_Field, so that the join looks like this:
Color_Field Pattern_Field
Blue Striped
Green Checkered
In Microsoft Query, inner joins are the default join type (for more
information, see page 105 in the "Microsoft Query User's Guide," version
1.0).
The SQL Statement
A Structured Query Language (SQL) SELECT statement is a type of macro that
you can use when you create a join. Note that SQL is very different from
Microsoft Excel's other macro languages (Visual Basic for Applications, and
the Excel 4.0 macro language).
It is NOT necessary to understand SQL in order to easily create joins in
Microsoft Query.
Every join has a SELECT statement associated with it. You can view the
SELECT Statement for any join in Microsoft Query by clicking "SQL" on the
toolbar. As with Microsoft Excel's macro recorder, you can use Query to
record the SELECT statement. Unlike Microsoft Excel's macro recorder, the
SELECT statement recorder is always on and cannot be turned off. Here's how
the SELECT statement might look in Microsoft Query for the above inner
join:
SELECT Color_Table.Color_Field, Pattern_Table.Pattern_Field
FROM c:\database.mdb Color_Table, c:\database.mdb
Pattern_Table
WHERE Color_Table.Join_Field = Pattern_Table.Join_Field
Note that .mdb is the file name extension for a Microsoft Access database,
which can have multiple tables in one file. In some other databases, such
as dBASE, Paradox, and FoxPro, each table must have its own file. In these
cases, the SQL syntax may appear redundant because the table name is always
the same as the file name without the extension.
SQL syntax varies among querying engines; for example, in Microsoft Access
the query from the above example resembles the following:
SELECT Color_Table.[Color_Field],
Pattern_Table.Pattern_Field
FROM Pattern_Table INNER JOIN Color_Table ON
Pattern_Table.[Join_Field] = Color_Table.[Join_Field];
The path to the table is not used in Microsoft Access because the table is
in a Microsoft Access .mdb file. Even if an external table is attached and
used in the query, the Microsoft Access SQL statement does not display the
path to the external table.
The Outer Join
Another kind of join is called an outer join. With an outer join, you get
all the records from one table and only those records from the other table
that have matching values from the first table. This may leave some of the
field entries blank, or "Null." For any two tables to be joined, there are
two possible outer joins, the "left outer join" and the "right outer join,"
(so named because you usually view the tables side-by-side). Using the
previous two tables in an example, the following is one of the two possible
outer joins:
Join_Field Color_Field Pattern_Field
1 Red (NULL)
2 Blue Striped
3 Green Checkered
The other possible join is as follows:
Join_Field Color_Field Pattern_Field
2 Blue Striped
3 Green Checkered
4 (NULL) Polka-Dot
NOTE: In an join, you do not see the word "NULL" when you view the data;
use the keyword "NULL" when you are working with joins.
In Microsoft Query, both kinds of outer join can be created easily using
the mouse (for more information about this procedure, see page 112 of the
"Microsoft Query User's Guide," version 1.0).
The following is how the SQL statement might look for the second example of
an outer join:
SELECT Color_Table.Color_Field, Pattern_Table.Pattern_Field
FROM {oj c:\database.mdb Color_Table LEFT OUTER JOIN
c:\database.mdb Pattern_Table ON Color_Table.Join_Field =
Pattern_Table.Join_Field}
To create a practical example of an outer join, make a list of a company's
products with sales figures for the products that had been sold, but not
excluding products that had not been sold. To do this, use a Product ID
field to join a Products table and a Sales table.
The Subtract Join
A third kind of join is the subtract join. A subtract join is the opposite
of an outer join; it includes ONLY those records in one table that don't
match any record in the other table. Like outer joins, there are two
possible subtract joins for any two tables that you want to join; however,
they are not usually referred to as "Left subtract join" or "Right subtract
join." A subtract join normally return fields from only one of the tables,
because by definition the other table's fields return only NULL values. The
following is one possible subtract join:
join_Field Color_Field
1 Red
and here is the other:
Join_Field Pattern_Field
4 Polka-Dot
In Microsoft Query, a subtract join is created by first creating an Outer
join, and then using the criteria "IS NULL" on an appropriate field
(Pattern_Field in the first example above; Color_Field in the second
example) to exclude records that match between the tables.
The following is how the SQL statement might look for the first subtract
join above:
SELECT Color_Table.Join_Field, Color_Table.Color_Field
FROM {oj c:\database.mdb Color_Table LEFT OUTER JOIN
c:\database.mdb Pattern_Table ON Color_Table.Join_Field =
Pattern_Table.Join_Field}
WHERE (Pattern_Table.Pattern_Field Is Null)
To create a practical example of a subtract join, a list of customers that
had not ordered recently. To do this, use an Order ID field to join a
Customers table and an Orders table.
The Full Outer Join
A fourth kind of join is the full outer join. A full outer join is a
combination of an outer join with its complimentary subtract join. A full
outer join includes all records from both tables and merges those records
that are common between the two tables. The following is a full outer join:
Join_Field Color_Field Pattern_Field
1 Red (NULL)
2 Blue Striped
3 Green Checkered
4 (NULL) Polka-Dot
In Microsoft Query, a full outer join is created by inserting the UNION
operator on its own line between the outer join's SELECT statement and
the subtract join's SELECT statement (see above).
To create a full outer join in Microsoft Query, use the following steps:
- Create an outer join, then click New on the File menu and create a
subtract join.
- Copy the subtract join's SQL.
- Switch to the outer join, type the word UNION on its own line below the
outer join's SQL statement, paste the subtract join's SQL below the word
UNION, and close the subtract join's window.
Here is how the SQL might look for the above full outer join (commands
below the UNION operator is pasted):
SELECT Color_Table.Join_Field, Color_Table.Color_Field,
Pattern_Table.Pattern_Field
FROM {oj c:\database.mdb Pattern_Table LEFT OUTER JOIN
c:\database.mdb Color_Table ON Color_Table.Join_Field =
Pattern_Table.Join_Field}
UNION
SELECT Color_Table.Join_Field, Color_Table.Color_Field,
Pattern_Table.Pattern_Field
FROM {oj c:\database.mdb Pattern_Table LEFT OUTER JOIN
c:\database.mdb Color_Table ON Color_Table.Join_Field =
Pattern_Table.Join_Field}
WHERE (Color_Table.Color_Field Is Null)
To create a practical example of a full outer join, merge overlapping
customer lists used by different departments, including fax numbers (which
were only on the first list), and Internet e-mail names (which are only in
the second list). Each department could continue to use its partial list
while having the complete, joined list available. They could be joined on a
Customer ID field.
The Self-Join
A fifth kind of join is the self-join. A self-join is a connections where a
field in a table is matched with a different field in a copy of the same
table. Using this example table:
Table_Three
Employee_ID Employee_Name Reports_To
1 Bob 3
2 Sue (NULL)
3 Jim 2
4 Jane 3
and a copy of it, as follows:
Table_Three_01
Employee_ID Employee_Name Reports_To
1 Bob 3
2 Sue (NULL)
3 Jim 2
4 Jane 3
A self-join could be used to create a list of employee names with their
supervisor's names. The Employee_ID in Table_Three would be joined with
the Reports_To in Table_Three_01. The following is how it might look at
first:
Employee_Name Employee_Name
Bob Jim
Sue (NULL)
Jim Sue
Jane Jim
However, because it is confusing to have the same field name for both
fields, change one of the field names, as follows:
Employee_Name Supervisor
Bob Jim
Sue (NULL)
Jim Sue
Jane Jim
The following is how the SQL might look for the above self-join:
SELECT table_three.Employee_Name,
table_three_01.Employee_Name 'Supervisor'
FROM c:\database.mdb table_three, c:\database.mdb
table_three_01
WHERE table_three.Employee_ID = table_three_01.Reports_To
When you return data to Microsoft Excel, it is of no use to rename the
field in Microsoft Query. This is true because Microsoft Excel uses the
original field name. For more information about this issue, please see the
following article in the Microsoft Knowledge Base:
Q121551 : XL5: Field Instead of Column Name in MSQUERY Returned to
Excel
A Microsoft Excel macro must change the column name every time the
returned data is refreshed (unless you return the data in a PivotTable, in
which case the Pivot itself can create and keep a custom field
name).
The Equi-Join and Natural Join
Almost all joins, including all examples given so far, are equi-joins and
natural joins. The meanings of these terms are of little significance to
the average user of Microsoft Query, but the next two paragraphs attempt
to explain the terms for those who may be curious.
The equi-join is a join in which records are retrieved based on whether the
join fields have matching values in both tables. That may seem like just a
definition of a join, but it isn't. An example of a non-equal join is a
join where records in the first table are joined to those records in the
second table where the joined field in the first table is greater than
(instead of equal to) the joined field in the second table (or less than,
or anything besides equal to). Naturally, this returns more records than an
equi-join.
A natural join is one in which only one of the two tables' joined
fields is returned. Since these two fields are by definition identical in
an equi-join, it is redundant to include both. For
a non-equal join, it is important to include both of those fields.
So, equi-joins and natural joins go together. You would want an equi-
joins (which describes most joins) to be natural join by returning only one
of the joined fields; but, if you ever use a non-equal join, you might also
want to make it a non-natural join by returning both of the joined fields.
There are other kinds of joins. The full spectrum of joins was most
recently defined in 1992 and this standard is known as SQL-92. Some joins
are not important to Microsoft Excel users because these joins do things
that are easier to do in Microsoft Excel.
The Cartesian Product
Trying to return data from two or more tables without any joins creates
what is called a "Cartesian product." A Cartesian product is defined as all
possible combinations of rows in all tables. Be sure you have joins before
trying to return data, because a Cartesian product on tables with many
records and/or on many tables could take several hours to complete. The
following is a Cartesian product as used on the two example tables; note
that this table is only 3 records times 3 records, which yields a total of
9 records. However, imagine if instead, the table was 100 records times
1,000 records times 10,000 records; then the table would contain
1,000,000,000 records!
Join_Field Join_Field Color_Field Pattern_Field
1 2 Red Striped
1 3 Red Checkered
1 4 Red Polka-Dot
2 2 Blue Striped
2 3 Blue Checkered
2 4 Blue Polka-Dot
3 2 Green Striped
3 3 Green Checkered
3 4 Green Polka-Dot
Occasionally, some users want to use a Cartesian product; however, most
users who get them do so accidentally, and are often confused by them.
Because most users exclude most of the fields in a join, a real Cartesian
product can easily look as perplexing as this:
Color_Field
Red
Red
Red
Blue
Blue
Blue
Green
Green
Green
If 100 records are added to Pattern_Table, this query would have 309
records (103 records each of Red, Blue, and Green).
Cartesian products have relatively simple SELECT statements. The following
is how the SQL might look for the above Cartesian product:
SELECT Color_Table.Color_Field, Pattern_Table.Pattern_Field
FROM c:\database.mdb Color_Table, c:\database.mdb
Pattern_Table
A practical example of a Cartesian product would be to create a list of
all possible combinations of options on a merchandise product, with price
totals for each combination.
REFERENCES
Microsoft Query "User's Guide," version 1.0, pages 101-114, 123-131
The following two books are not included with any Microsoft products and
are not produced by Microsoft.
The third-party products discussed here are manufactured by vendors
independent of Microsoft; we make no warranty, implied or otherwise,
regarding these products' performance or reliability.
"Understanding the New SQL: A Complete Guide," Morgan Kaufmann Publishers,
Inc., 1993.
"Joe Celko's SQL for Smarties: Advanced SQL Programming," Morgan Kaufmann
Publishers, Inc., 1995.
For more information about creating joins in Microsoft Query, choose the
Search button in Help and type:
joins, overview
Additional query words:
Keywords : xlquery
Version : WINDOWS:1.0,2.0; MACINTOSH:1.0,98
Platform : MACINTOSH WINDOWS
Issue type :
Last Reviewed: April 14, 1999