Results 1 to 5 of 5

Thread: Filtering Large Amounts of Records by Color

  1. #1

    Filtering Large Amounts of Records by Color



    Register for a FREE account, and/
    or Log in to avoid these ads!

    Hello,

    I have two columns of data each over 100K records. I need to compare the two columns in order to find duplicates and then pull out the unique records. I approached this problem by using conditional formatting to identify the duplicates. Now that all the duplicates cells are colored, I just wanted to filter by color and filter for the no fill cells.

    I am finding this incredibly slow, I know that I can expect it to be slow with the amount of records but, filtering by color actually ends up freezing excel. I even left it running overnight with no results.

    I will say I had to use the concatenate function for these values because the source was a .csv and I needed to add zeros at the beginning of a lot of records. I pasted the values in a new sheet but they are now numbers stored as text.

    Any suggestions?

  2. #2
    Magician Hercules1946's Avatar
    Join Date
    Mar 2013
    Location
    York, England
    Posts
    794
    Articles
    0
    Excel Version
    2010
    Its a bit difficult to advise you without seeing your data to understand why you are trying to use conditional formatting, but these points might help.
    1. You can add leading zeros to a number without ending up with a character format. If you have a column of numbers with differing numbers of digits qnd you want to make them all (say) 10 digits long using leading zeros, select "Custom" format, and enter the Type as "000000000"#.
    2. By using the Advanced Filter on the Data ribbon, you can get Excel to filter out all the duplicates in a column automatically. You can also create a copy of the unique entries in another column. If you do this for both your columns this should make the matching part easier.

  3. #3
    Acolyte candybg's Avatar
    Join Date
    Jul 2015
    Posts
    92
    Articles
    0
    Excel Version
    2010 and 2016
    I think a pivot table would work for finding the duplicates and unique values. If you upload a sample spreadsheet, I'd be glad to set it up for you.

    Regarding the leading zeros, are you sure you need them?

  4. #4
    Magician Hercules1946's Avatar
    Join Date
    Mar 2013
    Location
    York, England
    Posts
    794
    Articles
    0
    Excel Version
    2010
    Could you clarify a couple of points:

    Does the data your working on just have the two columns, or is there other data that needs to be kept with it ?
    Are the numbers in each separate column unique, or are there duplicates in addition to those between the two columns?

  5. #5
    Magician Hercules1946's Avatar
    Join Date
    Mar 2013
    Location
    York, England
    Posts
    794
    Articles
    0
    Excel Version
    2010
    Well, it seems that we might have lost Tyt but I think that its worth posting this rather neat solution, using the dictionary object in a bit of VBA code.
    Its well suited to this task, as in coming up with a unique list from both coulmns, it can cope with duplicates and out of sequence and is ideal for large amounts of data
    as its very fast.
    To copy the code below into your own workbook, hit ALT+F11 to open the editor, followed by Insert/Module from the menu, and paste the code into the module window that opens.
    Quit the Editor (ALT+Q) and save your workbook, To run the code at any time, with the worksheet open, simply hit ALT+F8 and run from the window that opens.
    There could be a (one time) tiny overhead if the MS Scripting Library is not enabled in VBA. If it complains, from VB Editor menu, select Tools, References and scan the list for the "Microsoft Scripting Runtime" Library. Tick Its box, Click OK, and then
    Alt+Q to close the Editor. Save your workbook, and run the code again.

    In the attachment, just open the worksheet, and run the code.

    Code:
    Option Explicit
    Option Base 1
    
    Sub UniqueValues()
        Dim dict As Scripting.Dictionary
        Set dict = CreateObject("scripting.dictionary")
        Dim c
        Dim rng1 As Range
        Dim rng2 As Range
        Dim lrow1 As Long
        Dim lrow2 As Long
        
    With Sheets("UniqueNumbers")
    lrow1 = .Cells(Rows.Count, "C").End(xlUp).Row   ')
    lrow2 = .Cells(Rows.Count, "D").End(xlUp).Row   ') Amend "C" and "D" in the range reference to your actual columns.
    Set rng1 = .Range("C2:C" & lrow1)               ') Also amend the start row if it isn't 2
    Set rng2 = .Range("D2:D" & lrow2)               ')
    
    End With
    With dict
        For Each c In rng1.Value
            .Item(c) = 0
        Next c
        For Each c In rng2.Value
            .Item(c) = 0
        Next c
    
    End With
    Sheets("UniqueNumbers").Select
    With dict              'sorted Unique values output to E2 and down
        Range("E2").Resize(.Count) = Application.Transpose(.Keys)
        
    End With
    With Sheets("UniqueNumbers")
    lrow1 = .Cells(Rows.Count, "E").End(xlUp).Row
    .Range("E2:E" & lrow1).Sort key1:=.Range("E2"), order1:=1
      .Columns.AutoFit
      .Activate
    End With
    
    End Sub
    Attached Files Attached Files

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •