- Function to quickly remove all duplicates from a specified column in Excel using VBA.
Examples of where this function shines:
- Assume that you have been given a list of customer data, but there are multiple records for each customer. You only need one, so need to delete the rest of the duplicate information.
- Many people will use VBA to loop through all the records, deleting any that occur more than once. Unfortunately, loops take more and more time as the data increases. This code is MUCH faster than using a loop to accomplish the task.
- This function keeps the first entry in the list of data, so your records order is kept intact
Macro Weakness(es):
- You may not want to keep the last entry in your list of data, not the first. Currently this would require sorting your data before running this function
Versions Tested:
This function has been tested with Excel 2003. It should not have any issues running from any of the Office applications from 97 or higher.
VBA Code Required:
Place the following code in a standard module:
Option Explicit Function FilterDelete(TargetColumn As Range) 'Author : Ken Puls (www.excelguru.ca) 'Macro Purpose: To quickly eliminate duplicates from specified column ' Intended for use with data lists with/without header information Dim lLastRow As Long Dim lLastCol As Long 'Check if multiple columns provided and exit if so If TargetColumn.Columns.Count <> 1 Then Exit Function With TargetColumn.Parent 'Determine last row and last column lLastRow = .Cells.Find(What:="*", After:=.Range("A1"), LookIn:=xlValues, LookAt:= _ xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row lLastCol = .Cells.Find(What:="*", After:=.Range("A1"), LookIn:=xlValues, LookAt:= _ xlPart, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious).Column 'Set up an index column of ascending numbers after the last column .Cells(1, lLastCol + 1).Value = 1 .Range(.Cells(2, lLastCol + 1), .Cells(lLastRow, lLastCol + 1)).FormulaR1C1 = "=R[-1]C+1" .Columns(lLastCol + 1).Cells.Copy .Columns(lLastCol + 1).Cells.PasteSpecial Paste:=xlValues 'Sort the records by the column specified in ascending order .Range(.Cells(1, 1), .Cells(lLastRow, lLastCol + 1)).Sort _ Key1:=TargetColumn, Order1:=xlAscending, _ Key2:=.Columns(lLastCol + 1) 'Set up an formula column at end to determine if each rows record matches 'the previous rows record. If so, mark it 0, otherwise 1 .Cells(1, lLastCol + 2).Value = 0 .Range(.Cells(2, lLastCol + 2), .Cells(lLastRow, lLastCol + 2)).FormulaR1C1 = _ "=if(RC[" & TargetColumn.Column - (lLastCol + 2) & "]=R[-1]C[" & TargetColumn.Column - (lLastCol + 2) & "],1,0)" .Columns(lLastCol + 2).Cells.Copy .Columns(lLastCol + 2).Cells.PasteSpecial Paste:=xlValues 'Sort the records by the match column. Eliminates complex ranges in large data sets that create errors .Range(.Cells(1, 1), .Cells(lLastRow, lLastCol + 2)).Sort _ Key1:=.Cells(1, lLastCol + 2) 'Autofilter and delete all cells showing a 1 as they are duplicate values With .Range(.Cells(1, 1), (.Cells(lLastRow, lLastCol + 2))) .AutoFilter .AutoFilter field:=lLastCol + 2, Criteria1:="1" End With .Range(.Cells(2, 1), .Cells(lLastRow, lLastCol + 2)).SpecialCells(xlCellTypeVisible).EntireRow.Delete .AutoFilterMode = False 'Resort the data back to the original order .Range(.Cells(1, 1), .Cells(.Rows.Count, lLastCol + 2).End(xlUp)).Sort _ Key1:=.Cells(1, lLastCol + 1) 'Remove index columns created for duplicate removal .Range(.Cells(1, lLastCol + 1), .Cells(1, lLastCol + 2)).EntireColumn.Delete End With End Function
- Call the function from another procedure.
- For example, to delete all duplicates in column C, place the following code in a standard module and run it:
Sub UseIt() 'Author: Ken Puls (www.excelguru.ca) 'Turn off screen updating for speed Application.ScreenUpdating = False 'Use the function to eliminate all duplicates in column C FilterDelete (ActiveSheet.Range("C1")) 'Restore screen updating Application.ScreenUpdating = True End Sub
- A column of index numbers is created after the last column of information in the spreadsheet. This is to preserve the original order.
- The data is then sorted based on the specified column, with the secondary sort key being the original order. (This ensures that the first instance of a record will be maintained.)
- A column of formulas is inserted at the end of the sheet, comparing the Target Column's data for that row against the previous row. If it does not match, it is a new record, if it does match, then the record is a duplicate.
- Using the autofilter, all data that has been flagged as a duplicate is deleted.
The End Result:
- All duplicates will be eliminated from the specified column.
Rate this article