ACC1x: "Unique Values Only" Select Query Shows Duplicates

ID: Q93692


The information in this article applies to:


SYMPTOMS

The result of a query includes duplicate values even though you selected the Unique Values Only check box (SELECT DISTINCT) in the Query Properties dialog box.


CAUSE

Duplicate values occur when both of the following conditions are met:

  1. The query input data set contains more than approximately 64K of data and the duplicate values are widely scattered in the input data.

    For example, a SELECT DISTINCT query on a text column with an average length of 20 characters will not display duplicate values unless the query input data set included more than 3,300 values (65536 / 20 = 3276.8. Note that if the query is based on a table that has 10,000 rows, but the specified criteria retrieves only 20 rows, it is the set of 20 rows that determines the size of the query input data set. In other words, it is the WHERE clause that determines the size of the query input data set.


  2. The values that have duplicates appear close (maybe in the first dozen values) to what would normally be the beginning of the query output. By default, a SELECT DISTINCT query displays data sorted implicitly. For example, the following displays the result sorted by Order ID:

    
          SELECT DISTINCT [Order Details].[Order ID]
          FROM [Order Details]; 


    However, in a SELECT DISTINCT query that uses more than one column and sorts by a column other than the first one, the duplicates could appear anywhere in the output. To see if there are duplicates, you need to remove the sort.


NOTE: This problem also appears in a Crosstab query as duplicated columns with names like Field1. Use Fixed Column heading to prevent the problem.


RESOLUTION

In addition to SELECT DISTINCT queries, you can use GROUP BYs in a totals query to eliminate duplicates in the output. Both SELECT DISTINCT queries and GROUP BYs in a totals query are limited to 10 columns. Usually, GROUP BY is faster when the resulting data set is much smaller than the input data set, and SELECT DISTINCT is faster when there are only a few duplicates.

To ensure the correct results, use GROUP BYs in a totals query instead of using the Unique Values Only query property. For example, the following two select queries give identical results:


   SELECT DISTINCT [Order Details].[Order ID]
   FROM [Order Details];

   SELECT DISTINCTROW [Order Details].[Order ID]
   FROM [Order Details]
   GROUP BY [Order Details].[Order ID]; 


STATUS

Microsoft has confirmed this to be a problem in Microsoft Access version 1.0. This problem was corrected in Microsoft Access version 1.1.


REFERENCES

Microsoft Access "User's Guide," version 1.0, Chapter 6, "Designing Select Queries," pages 128-130 and 142-148


Keywords          : kbusage QryOthr 
Version           : 1.0
Platform          : WINDOWS 
Issue type        : kbbug 

Last Reviewed: March 12, 1999