• Quickly Eliminate Duplicate Records From A Specific Column

    Macro Purpose:
    • Function to quickly remove all duplicates from a specified column in Excel using VBA.
    Examples of where this function shines:
    • Assume that you have been given a list of customer data, but there are multiple records for each customer. You only need one, so need to delete the rest of the duplicate information.
    • Many people will use VBA to loop through all the records, deleting any that occur more than once. Unfortunately, loops take more and more time as the data increases. This code is MUCH faster than using a loop to accomplish the task.
    • This function keeps the first entry in the list of data, so your records order is kept intact
    Macro Weakness(es):
    • You may not want to keep the last entry in your list of data, not the first. Currently this would require sorting your data before running this function
    Versions Tested:
    This function has been tested with Excel 2003. It should not have any issues running from any of the Office applications from 97 or higher.

    VBA Code Required:

    Place the following code in a standard module:

    Code:
    Option Explicit
    Function FilterDelete(TargetColumn As Range)
    'Author       : Ken Puls (www.excelguru.ca)
    'Macro Purpose: To quickly eliminate duplicates from specified column
    '   Intended for use with data lists with/without header information
    
    Dim lLastRow As Long
    Dim lLastCol As Long
    
    'Check if multiple columns provided and exit if so
    If TargetColumn.Columns.Count <> 1 Then Exit Function
    
    With TargetColumn.Parent
        'Determine last row and last column
        lLastRow = .Cells.Find(What:="*", After:=.Range("A1"), LookIn:=xlValues, LookAt:= _
            xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
        lLastCol = .Cells.Find(What:="*", After:=.Range("A1"), LookIn:=xlValues, LookAt:= _
            xlPart, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious).Column
    
        'Set up an index column of ascending numbers after the last column
        .Cells(1, lLastCol + 1).Value = 1
        .Range(.Cells(2, lLastCol + 1), .Cells(lLastRow, lLastCol + 1)).FormulaR1C1 = "=R[-1]C+1"
        .Columns(lLastCol + 1).Cells.Copy
        .Columns(lLastCol + 1).Cells.PasteSpecial Paste:=xlValues
    
        'Sort the records by the column specified in ascending order
        .Range(.Cells(1, 1), .Cells(lLastRow, lLastCol + 1)).Sort _
            Key1:=TargetColumn, Order1:=xlAscending, _
            Key2:=.Columns(lLastCol + 1)
    
        'Set up an formula column at end to determine if each rows record matches
        'the previous rows record. If so, mark it 0, otherwise 1
        .Cells(1, lLastCol + 2).Value = 0
        .Range(.Cells(2, lLastCol + 2), .Cells(lLastRow, lLastCol + 2)).FormulaR1C1 = _
            "=if(RC[" & TargetColumn.Column - (lLastCol + 2) & "]=R[-1]C[" & TargetColumn.Column - (lLastCol + 2) & "],1,0)"
        .Columns(lLastCol + 2).Cells.Copy
        .Columns(lLastCol + 2).Cells.PasteSpecial Paste:=xlValues
    
        'Sort the records by the match column.  Eliminates complex ranges in large data sets that create errors
        .Range(.Cells(1, 1), .Cells(lLastRow, lLastCol + 2)).Sort _
            Key1:=.Cells(1, lLastCol + 2)
    
        'Autofilter and delete all cells showing a 1 as they are duplicate values
        With .Range(.Cells(1, 1), (.Cells(lLastRow, lLastCol + 2)))
            .AutoFilter
            .AutoFilter field:=lLastCol + 2, Criteria1:="1"
        End With
        .Range(.Cells(2, 1), .Cells(lLastRow, lLastCol + 2)).SpecialCells(xlCellTypeVisible).EntireRow.Delete
        .AutoFilterMode = False
    
        'Resort the data back to the original order
        .Range(.Cells(1, 1), .Cells(.Rows.Count, lLastCol + 2).End(xlUp)).Sort _
            Key1:=.Cells(1, lLastCol + 1)
    
        'Remove index columns created for duplicate removal
        .Range(.Cells(1, lLastCol + 1), .Cells(1, lLastCol + 2)).EntireColumn.Delete
    End With
    End Function
    How to use the code:
    • Call the function from another procedure.
    • For example, to delete all duplicates in column C, place the following code in a standard module and run it:
    Code:
    Sub UseIt()
    'Author: Ken Puls (www.excelguru.ca)
    'Turn off screen updating for speed
        Application.ScreenUpdating = False
    
        'Use the function to eliminate all duplicates in column C
        FilterDelete (ActiveSheet.Range("C1"))
        'Restore screen updating
        Application.ScreenUpdating = True
    End Sub
    How it works:
    • A column of index numbers is created after the last column of information in the spreadsheet. This is to preserve the original order.
    • The data is then sorted based on the specified column, with the secondary sort key being the original order. (This ensures that the first instance of a record will be maintained.)
    • A column of formulas is inserted at the end of the sheet, comparing the Target Column's data for that row against the previous row. If it does not match, it is a new record, if it does match, then the record is a duplicate.
    • Using the autofilter, all data that has been flagged as a duplicate is deleted.
    The End Result:
    • All duplicates will be eliminated from the specified column.

     

    I'm afraid that you must be logged in to comment or leave a testimonial. I wish it could be otherwise, but I'm trying to keep my site spam free for everyone's benefit. If you don't yet have an account it's completely free to sign up, and a very quick process. Simply click here to Register. Not only can you post a comment here, but it gives you full access to posts questions in our forum as well!

     

    If you already have an account, and just haven't logged in yet, what are you waiting for? Login Now!