Find a multi string match in Column A

Shazam

New member
Joined
Feb 27, 2019
Messages
12
Reaction score
0
Points
0
Excel Version(s)
Microsoft office plus pro 2016
Hello everyone!

I'm hoping you can help me with this.

Problem:
In Column A I have multiple strings, for example:

1. 20D Fun 0L
2. 10D Fun 0L
3. 0L Fun 20D

As you can see, 1. and 2. have the same information.

Solution:
Ideally I would like the formula to search Column A and find cells which have 3 matching strings. In Column B, the result should output the matching cell. For example, in the data I presented, the result for 1. would be 3.

I appreciate any help.
 
Here's some sample data and how I'd like the results to appear. There only exists 1 duplicate for each cell but the strings are in different order.
 

Attachments

  • Book1.xlsx
    10.5 KB · Views: 12
I found this and it works for what I need, but how do I get it to list through the entire Column A and how would I get it to output the results in column B or at the very least highlight all the matching results in column A?

Link: https://stackoverflow.com/questions/...order-in-excel
 
Last edited:
Last edited:
This macro is in the attached and can be run by clicking on Button 1 near cell F2 on the sheet.
Code:
Sub blah()
Set rngToCheck = Range("A2:A35")    'adjust this as required.
For Each cll In rngToCheck.Cells
  ofset = 1
  cllWords = Split(Application.Trim(cll.Value))
  For Each celle In rngToCheck.Cells
    myCount = 0
    If cll.Address <> celle.Address Then
      celleWords = Split(Application.Trim(celle.Value))
      For Each cllWord In cllWords
        For Each celleWord In celleWords
          If LCase(cllWord) = LCase(celleWord) Then myCount = myCount + 1
          If myCount > 2 Then
            celle.Copy cll.Offset(, ofset)
            ofset = ofset + 1
            FoundOne = True
            Exit For
          End If
        Next celleWord
        If FoundOne Then
          FoundOne = False
          Exit For
        End If
      Next cllWord
    End If
  Next celle
Next cll
End Sub
It looks for 3 matching words (any order). If it finds more than one other matching cell it adds those cell values in more columns to the right.

Of course, you may need to adjust the range "A2:A35" in the code.
 

Attachments

  • ExcelGuru9846Book1.xlsm
    19.4 KB · Views: 10
Last edited:
Shazam, for your information, you should always provide links to your cross posts.
https://www.mrexcel.com/forum/excel-questions/1089037-multiple-string-matching.html
This is a requirement, not just a request.
If you have cross posted at other places, please add links to them too.
Why? Have a read of http://www.excelguru.ca/content.php?184

Sorry about that, I'll make sure I don't repeat that mistake.

Here's another link
https://www.excelforum.com/excel-fo...11-find-a-multi-string-match-in-column-a.html


This macro is in the attached and can be run by clicking om Button 1 near cell F2 on the sheet.
Code:
Sub blah()
Set rngToCheck = Range("A2:A35")    'adjust this as required.
For Each cll In rngToCheck.Cells
  ofset = 1
  cllWords = Split(Application.Trim(cll.Value))
  For Each celle In rngToCheck.Cells
    myCount = 0
    If cll.Address <> celle.Address Then
      celleWords = Split(Application.Trim(celle.Value))
      For Each cllWord In cllWords
        For Each celleWord In celleWords
          If LCase(cllWord) = LCase(celleWord) Then myCount = myCount + 1
          If myCount > 2 Then
            celle.Copy cll.Offset(, ofset)
            ofset = ofset + 1
            FoundOne = True
            Exit For
          End If
        Next celleWord
        If FoundOne Then
          FoundOne = False
          Exit For
        End If
      Next cllWord
    End If
  Next celle
Next cll
End Sub
It looks for 3 matching words (any order). If it finds more than one other matching cell it adds those cell values in more columns to the right.

Of course, you may need to adjust the range "A2:A35" in the code.

Thanks for trying out the solution, unfortunately the results are not what I'm looking for. I may not have explained clearly what solution I'm after.

In the excel file I provided, column B is just an example of how I'd like the results to appear.

Ex. B2 is the answer/result for A2 because it has 3 matching strings in A17, then a17 was copied over to B2 as the answer. Hope this makes more sense.
 
In what way is my offering not giving the same results?
Tomorrow, I'll run the macro again, post screenshot of the results and ask you to point out the differences between your column B and mine...
 
In what way is my offering not giving the same results?
Tomorrow, I'll run the macro again, post screenshot of the results and ask you to point out the differences between your column B and mine...

I really appreciate your help. I've attached a screenshot to show the results I'm getting when I click the button.
 

Attachments

  • excelguru.PNG
    excelguru.PNG
    48.6 KB · Views: 14
Screenshot of macro blah run on your file attached to msg#2 with results in column B.
Your desired results in column C.
Apart from the NA entries, what's the difference?
2019-02-28_114415.jpg
 
Screenshot of macro blah run on your file attached to msg#2 with results in column B.
Your desired results in column C.
Apart from the NA entries, what's the difference?
View attachment 8938

Oh sorry I thought column B info was already there because it looked the same :loco:

Thanks so much for this!
I didn't notice this before but is there a way to adjust it to look through up to 7 strings? The file I have has between 3 to 7 strings and the current code malfunctions when it gets to those.
 
Actually between 2-8 strings :)

So basically there are duplicates and I need to get rid of them but there's thousands. I would need it to find any 2 strings matching with other 2 strings, 3 strings with other 3 (as we have now) etc.
 
Last edited:
I didn't notice this before but is there a way to adjust it to look through up to 7 strings? The file I have has between 3 to 7 strings and the current code malfunctions when it gets to those.
Supply a file with such sample data and it going wrong.
What's the error message?

ps. hint:
If myCount > 2 Then
searches for 3 matching strings

it then follows that
If myCount > 3 Then
might search for 4 matching strings and

If myCount > 7 Then
could search for 8 matching strings?

But note that if 3 matching strings are sought, the results will have to include 4,5,6,7 & 8 matching strings too (if a pair of cells contain 7 matching strings, they also contain at least 3 matching strings).
 
Last edited:
Supply a file with such sample data and it going wrong.
What's the error message?

ps. hint:
If myCount > 2 Then
searches for 3 matching strings

it then follows that
If myCount > 3 Then
might search for 4 matching strings and

If myCount > 7 Then
could search for 8 matching strings?

But note that if 3 matching strings are sought, the results will have to include 4,5,6,7 & 8 matching strings too (if a pair of cells contain 7 matching strings, they also contain at least 3 matching strings).

Ahh, ok I just played around with it and I see what it's doing. It fills in additional rows because if I put 2 matching, then even if that string has 7 it will output multiple duplicates since it's looking for any 2 matches despite if it already found 1 match.
So you did it perfectly based on what I asked for :) However I would like it to have some limitations. Is it possible to search through the list, and if a cell has 2 strings, it will only look for a duplicate within 2 string cells, if it has 3 strings, only look through 3 string cells etc.
Right now even if a cell has 4 strings, it'll add in matches for 5, 6, and 7 strings that have the duplicate within them.

In this example, Row 1 has 3 strings so it will look for another 3 string match meaning it will only match with Row 4 since they have the same AMOUNT of strings. Since there will always only be 1 match it will then look at row 2 and search for a match with the same AMOUNT of strings so in this case row 4 would match only, row 5 would match 7, row 6 would match 8 and so on. Also instead of outputting the result in column B as "Sample data 0mg" instead if it could output "1" for the first match (or the first input), "2" for the second match, and "No duplicate" for no duplicates.

1. Sample data 0mg
2. Sample data 0mg 100L
3. data Sample 0mg
4. data 100L Sample 0mg

5. Sample data 0mg 100L 20g fair
6. Sample data 0mg 200L 20g unfair
7. data Sample 0mg 20g 100L
8. data Sample 0mg 20g 200L

Thanks again for all your help!
 
Last edited:
Is it possible to search through the list, and if a cell has 2 strings, it will only look for a duplicate within 2 string cells, if it has 3 strings, only look through 3 string cells etc.
<snip>
In this example, Row 1 has 3 strings so it will look for another 3 string match meaning it will only match with Row 4 since they have the same AMOUNT of strings.
?!?! But row 4 has 4 strings!



row 5 would match 7,
Row 5 has 6 strings, row 7 has only 5!



row 6 would match 8
Row 6 has 6 strings, row 8 has only 5!



1. Can you confirm what you say above is 100% correct? If you can confirm, you'll have to explain some more.

2. Would it be correct to say the following:
You want a match to be identified if all the strings in one cell match all the strings in another cell, regardless of order.
 
?!?! But row 4 has 4 strings!



Row 5 has 6 strings, row 7 has only 5!



Row 6 has 6 strings, row 8 has only 5!



1. Can you confirm what you say above is 100% correct? If you can confirm, you'll have to explain some more.

2. Would it be correct to say the following:
You want a match to be identified if all the strings in one cell match all the strings in another cell, regardless of order.

I mean 1 and 3 should match, 2 and 4 should match.

5 and 7 should have the same amount of strings, additional string is an accident, same with 6 and 8.

Yes you are right about "You want a match to be identified if all the strings in one cell match all the strings in another cell, regardless of order."
 
Try:
Code:
Sub blah()
Set rngToCheck = Range("A2:A35")    'adjust this as required.
rngToCheck.Offset(, 1).Value = "No Duplicate"
For Each cll In rngToCheck.Cells
  If cll.Offset(, 1).Value = "No Duplicate" Then
    MatchNo = 2
    cllWords = Split(Application.Trim(cll.Value))
    cllWordCount = UBound(cllWords) + 1
    If cllWordCount > 2 Then
      Set remainingCells = Nothing
      On Error Resume Next
      Set remainingCells = Intersect(rngToCheck, cll.Offset(1).Resize(rngToCheck.Rows.Count))
      On Error GoTo 0
      If Not remainingCells Is Nothing Then
        For Each celle In remainingCells
          celleWords = Split(Application.Trim(celle.Value))
          celleWordCount = UBound(celleWords) + 1
          If celleWordCount = cllWordCount Then
            myCount = 0
            For Each cllWord In cllWords
              For Each celleWord In celleWords
                If LCase(cllWord) = LCase(celleWord) Then myCount = myCount + 1
              Next celleWord
            Next cllWord
            If myCount = cllWordCount Then
              cll.Offset(, 1).Value = 1
              celle.Offset(, 1).Value = MatchNo
              
              'Enable only ONE of the next two lines:
              MatchNo = MatchNo + 1    'enable this one to keep searching for more matches and number them accordingly, 2,3,4 etc.
              'Exit For    'enabling this line instead will stop searching after the first match is found.
            
            End If
          End If
        Next celle
      End If
    End If
  End If
Next cll
End Sub
It will only identify at least 3 matches (one and two word strings are ignored), determined by the line:
If cllWordCount > 2 Then

See also comments in the code.
 
This version adds hyperlinks in column B to link back to the first instance of a match, so where there's a 2, 3, 4 etc. in column B, it will link back to the first match (in column A)
Code:
Sub blah2() 'note it's called blah2.
Set rngToCheck = Range("A2:A35")    'adjust this as required.
rngToCheck.Offset(, 1).Value = "No Duplicate"
rngToCheck.Offset(, 1).Hyperlinks.Delete
For Each cll In rngToCheck.Cells
  If cll.Offset(, 1).Value = "No Duplicate" Then
    MatchNo = 2
    cllWords = Split(Application.Trim(cll.Value))
    cllWordCount = UBound(cllWords) + 1
    If cllWordCount > 2 Then
      Set remainingCells = Nothing
      On Error Resume Next
      Set remainingCells = Intersect(rngToCheck, cll.Offset(1).Resize(rngToCheck.Rows.Count))
      On Error GoTo 0
      If Not remainingCells Is Nothing Then
        For Each celle In remainingCells
          celleWords = Split(Application.Trim(celle.Value))
          celleWordCount = UBound(celleWords) + 1
          If celleWordCount = cllWordCount Then
            myCount = 0
            For Each cllWord In cllWords
              For Each celleWord In celleWords
                If LCase(cllWord) = LCase(celleWord) Then myCount = myCount + 1
              Next celleWord
            Next cllWord
            If myCount = cllWordCount Then
              cll.Offset(, 1).Value = 1
              celle.Offset(, 1).Value = MatchNo
              
              'disable the next line if you don't want hyperlinks:
              ActiveSheet.Hyperlinks.Add Anchor:=celle.Offset(, 1), Address:="", SubAddress:=cll.Address

              'Enable only ONE of the next two lines:
              MatchNo = MatchNo + 1    'enable this one to keep searching for more matches and number them accordingly, 2,3,4 etc.
              'Exit For    'enabling this line instead will stop searching after the first match is found.
            
            End If
          End If
        Next celle
      End If
    End If
  End If
Next cll
End Sub
.
 
This works almost perfectly!! There's 1 issue that comes up where if 2 words are the same it will start adding up, so instead of showing 1 and 2, it will keep going, 3, 4, 5, 6, 10, 18 etc.

I've attached an example.
 

Attachments

  • ExcelGuru9846Book1.xlsm
    19.9 KB · Views: 10
test:
Code:
Sub blah()
Set rngToCheck = Range("A2:A35")    'adjust this as required.
rngToCheck.Offset(, 1).Value = "No Duplicate"
rngToCheck.Offset(, 1).Hyperlinks.Delete
For Each cll In rngToCheck.Cells
  If cll.Offset(, 1).Value = "No Duplicate" Then
    MatchNo = 2
    cllWords = Split(Application.Trim(cll.Value))
    cllWordCount = UBound(cllWords) + 1
    If cllWordCount > 2 Then
      Set remainingCells = Nothing
      On Error Resume Next
      Set remainingCells = Intersect(rngToCheck, cll.Offset(1).Resize(rngToCheck.Rows.Count))
      On Error GoTo 0
      If Not remainingCells Is Nothing Then
        For Each celle In remainingCells
          celleWords = Split(Application.Trim(celle.Value))
          celleWordCount = UBound(celleWords) + 1
          If celleWordCount = cllWordCount Then
            myCount = 0
            cllWordsX = cllWords
            For i = LBound(cllWordsX) To UBound(cllWordsX)
              For j = LBound(celleWords) To UBound(celleWords)
                If Len(cllWordsX(i)) > 0 And Len(cllWordsX(i)) > 0 And LCase(cllWordsX(i)) = LCase(celleWords(j)) Then
                  myCount = myCount + 1
                  cllWordsX(i) = Empty: celleWords(j) = Empty
                  Exit For
                End If
              Next j    'celleWord
            Next i    'cllWord
            If myCount = cllWordCount Then
              cll.Offset(, 1).Value = 1
              celle.Offset(, 1).Value = MatchNo
              
              'disable the next line if you don't want hyperlinks:
              ActiveSheet.Hyperlinks.Add Anchor:=celle.Offset(, 1), Address:="", SubAddress:=cll.Address

              'Enable only ONE of the next two lines:
              MatchNo = MatchNo + 1    'enable this one to keep searching for more matches and number them accordingly, 2,3,4 etc.
              'Exit For    'enabling this line instead will stop searching after the first match is found.
            
            End If
          End If
        Next celle
      End If
    End If
  End If
Next cll
End Sub
 
Back
Top