Methods of Deleting Duplicate Records from a Database
ID: Q104332
|
The information in this article applies to:
-
Microsoft FoxPro for Windows, versions 2.5, 2.5a
-
Microsoft FoxPro for MS-DOS, versions 1.02, 2.0, 2.5, 2.5a
-
Microsoft FoxBASE+ for MS-DOS, version 2.01
-
Microsoft FoxBASE+ for Macintosh, version 2.01
SUMMARY
You can use a number of methods to eliminate duplicate records from a
database. The method you use will be determined by the specific
product being used, whether making a duplicate of the original table
is feasible, and the amount of disk space available.
The three methods can be summarized as follows:
- Process database sequentially, determining if records are
duplicates and manually deleting duplicate records. This method
requires writing a program.
- Create a unique index and copy the records to a new file that
contains only unique records. This method can be performed without
writing a program.
- In FoxPro 2.0 or later, perform a SELECT DISTINCT command, which
will create a new table with only unique records. This method can
be performed without writing a program.
MORE INFORMATION
Method 1 - Process Database Sequentially
This method can be used with FoxBASE+ as well as FoxPro. It requires
the least amount of disk space to perform the deletions, although it
requires a PACK command after all deletions have been completed. The
PACK command can require 3-5 times the database size in disk space.
- Create a program named DUPE.PRG with the following commands:
* Memory variables -
* counter = counter used to cycle through fields
* firstrec = array containing first record to test
* secrec = array containing second record to test
USE <filename>
* The following command would be modified to include all fields
* in the data table that make a record unique.
* Some fields may have to be converted to character fields.
INDEX ON field1 + field2 TO temp.idx
SET INDEX TO temp.idx
GO TOP
* Use two arrays to see if data has changed.
* This program requires that all fields in each record be
* identical in order for a record to be considered a duplicate.
* If fewer fields were required to define a record as a duplicate,
* the arrays would be built with only those fields required to
* define a record as a duplicate.
DO WHILE .NOT. EOF() && In FoxPro 2.0 and later, DO WHILE can be
SCATTER TO firstrec && replaced by more efficient SCAN.ENDSCAN
counter = 1
SKIP
SCATTER TO secrec
* Compare each array element to see if other record differs.
DO WHILE counter < FCOUNT()
IF firstrec(counter) <> secrec(counter)
EXIT
ENDIF
DELETE
EXIT
ENDDO
ENDDO
- Issue the following command:
DO dupe.prg
- Pack the database if desired.
Method 2 - Create a Unique Index and Copy the Records to a New File
This method can be used with FoxBASE+ as well as FoxPro.
- Issue the following commands:
USE <database>
SET UNIQUE ON
- Create a temporary, unique index. Modify the following command as
necessary to include all the fields that make a unique record.
INDEX ON field1 + field2 TO TEMP.IDX
SET INDEX TO temp.idx
- If you are not using FoxPro 2.0 or later, or if maintaining the
index expressions in the current .CDX file is not required, create
a new file as follows. Otherwise, go to step 4.
COPY TO newtable
USE newtable
- If FoxPro 2.0 or later is being used and it is desirable to
maintain the structure of the original .CDX file, create a new file
as follows. Otherwise, go to step 5.
COPY TO newtable WITH CDX
USE newtable
- Copy or rename the new table, memo file, and .CDX to the original
name. If you are renaming, delete the original file(s) first.
- Issue the following command:
SET UNIQUE OFF
- Delete TEMP.IDX if desired.
Method 3 - In FoxPro 2.0 or Later, Use SELECT
DISTINCT to Create New Table with Only Unique Records
- Issue the following command:
USE <filename>
- Do one of the following:
- Create a new query by choosing New Query from the Run menu,
selecting the Fields check box, selecting the No Duplicates
check box, and choosing OK. From the Output list box, choose
Table/DBF and type "newtable" (without the quotation marks) in
the Name box. Choose the Do Query button to execute the query.
-or-
- Issue the following command in the Command window:
SELECT DISTINCT * from <filename> INTO TABLE newtable
- If maintaining the index expressions in the current .CDX file is
not required, create a new file as follows. Otherwise, go to step
4.
COPY TO newtable
USE newtable
- If you want to maintain the structure of the original .CDX file,
create a new file as follows. Otherwise, go to step 5.
COPY TO newtab2 WITH CDX
USE newtab2
APPEND FROM newtable
- Copy or rename the new table, memo file, and .CDX to the original
name. If you are renaming, delete the original file(s) first.
Additional query words:
FoxDos FoxWin remove sql
Keywords : kbcode FxprgSql
Version : 2.50 2.50a | 1.02 2.00 2.50 2.50
Platform : MS-DOS WINDOWS
Issue type :
Last Reviewed: August 12, 1999