Removing Duplicate Records using SQL Server 2005

Duplicate records can occur numerous ways, such as loading source files too many times, keying the same data more than once, or from just bad database coding. Having a primary key on your table (and you always should have one) can will in the removal of the duplicate records, but even a primary key it is never a fun task to have handed to you to complete.

I will demonstrate how you can use a common-table expression (CTE) in 2005 to easily remove duplicate entries from a table:

;WITH PendingCalculationCTE(EmployeeId, Start, Finish, [Timestamp], TransactionId, LockFlag, Ranking)
AS(
SELECT EmployeeId, Start, Finish, [Timestamp], TransactionId, LockFlag,
Ranking = DENSE_RANK()
OVER(PARTITION BY EmployeeId, Start, Finish, [Timestamp], TransactionId, LockFlag
ORDER BY NEWID() ASC)
FROM PendingCalculation
)

DELETE FROM PendingCalculationCTE WHERE Ranking > 1

 The script above defines my CTE. I am using a windowing function named DENSE_RANK to group the records together based on the EmployeeId, Start, Finish, [Timestamp], TransactionId, LockFlag fields, and assign them a sequential value randomly. This means that if I have two records with the exact same EmployeeId, Start, Finish, [Timestamp], TransactionId, LockFlag values, the first record will be ranked as 1, the second as 2, and so on.
Because a CTE acts as a virtual table, I am able to process data modification statements against it, and the underlying table will be affected. In this case, I am removing any record from the PendingCalculationCTE that is ranked higher than 1. This will remove all of my duplicate records.

DENSE_RANK – Returns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s