PDA

View Full Version : Determining textual differences between cells



matt0925
2012-05-17, 04:56 PM
Each week I put together a list of changes to a pre-existing database and then send off those changes to another group. Later in the week when they return the updated file I check to see if they have made all correct changes. These changes generally include adding or removing text from specific cells or repositioning columns.

I am looking for a dynamic way to track if all necessary corrections were made. To be clear, I do not need simple conditional formatting which will tell me which cells are not alike. I know which cells are not alike. I need to be able to say in some form of coding:

"Check to see if cell X is the same as cell Y, minus the highlighted/colored/striked-out text." Or similarly "Check to see if cell X is the same as cell Y, except with the addition of the text that was colored a specific color in cell Y."

Any advice would be greatly appreciated.

Ken Puls
2012-05-18, 06:34 AM
Hi there, and welcome to the forum.

I'm not sure I'm following here... you're looking for a way of tracking what changes were actually made to the file? Is that it?

Bob Phillips
2012-05-18, 05:25 PM
It is not clear how you get the information, is it separate sheets, the same sheet; and whether the two data items contain the checked data or not.

Can you post an example workbook?

matt0925
2012-05-18, 10:01 PM
Thank you for replying. To be more clear, I have a file which is stored on database software that needs to be updated regularly. This file can be either extracted into Word or Excel. I generally extract it into Excel because it is my assumption that an Excel spreadsheet is better at anaylzing data than a Word table, but if based on the following details you disagree please let me know.

Steps:

1. Extracl file into Excel
2. Make updates in Excel file
a. Additional text needed: new text is inserted in red lettering, e.g. The big brown dog went to the park.
b. Too much text: text needed to be deleted is struck out (however, it is my guess that moving forward it will be easier to get Excel to recognize colors than whether or not text is struck out)
c. A cell within a combination of a and b
3. Iteration with updates are handed off to second team
4. Second team manually makes all updates to database software (though this might seem like an extra step updating the software takes a lot of time and thus can be done more effictevely for a number of reasons by second team)
5. After I am informed that all updates have been made, I then download the most current iteration of the file and manually check to see that all necessary updates have been made to the previously stipulated cells.

The problem is that by manually going through -- or even using track changes in Word as a "Cliff Notes," the process still takes several hours to go through 300-400 pages of reports. I need a way to do this more dynamically. Any thoughts?

My current thoughts:

1. Cells in which language has been added: Verification is easy, A1=B1, true or false
2. Cells in which language needs to be removed: Can a macro be developed to recognize the struck out (or potentially colored) wording and then verify if cell in column B is the same as related cell in column A, except without the specific text that needs to be removed
3. Cells in which addition and subtraction must occur, I assume that once the above step is figured out this will be relatively easy to figure out

Ken Puls
2012-05-19, 06:54 AM
Just out of curiosity...

You download data from the database server into an Excel file (into Workbook A)
You make changes to the Excel file (Workbook A)
You send it to somone to make changes in the database (from Workbook A)
You download data from the database server into an Excel file to see if all changes were made (into Workbook B)

Is that right?

If so, why do you need to check if the changes are made? Shouldn't you just be able to compare Workbook A's text to Workbook B's? If they match exactly, then the changes were made. If not, then something is still outstanding, so you could highlight the rows that aren't right.

Or did I miss something?

matt0925
2012-05-20, 02:02 AM
Ken:

No, you are correct. Once updates have been made to the file on the database software I download the latest interation of the updated file into an Excel workbook (workbook B). I then check workbook B against the original workbook (workbook A) that was sent to this other team with all necessary changes.

I'm looking for a dynamic way to check workbook A against workbook B. As I said, for the cells in which text is added there is not problem, because workbook A will match workbook B and if it does not than I know that the correct changes were not made. However, checking cells in which text was supposed to be removed is more difficult as the new cell will obvioulsy be different than the old cell. I need to know that the specific text which was suppose to be removed was in fact removed.

Ken Puls
2012-05-22, 11:50 PM
Ah, I see where you're going with this.

Curious, your download from the database, is it all in a single column? Could you give us a workbook that shows some sample data of raw, correct, re-download data scenarios? Just a dozen records or so, but something that highlights the issues?

matt0925
2012-05-23, 01:05 AM
I download Workbook B and then copy and paste Workbook A along side Workbook B. Generally the majority of corrections are made in 1 column -- so I am basically only concerned about comparing 2 columns (1 from Workbook A and 1 from workbook B).

Currently text that needs to be added is put into Workbook A as red and text that is being removed is formatted as strikethrough. I am not married to either of these methods for identifying text to be added and deleted and would gladly do it differently if there's an easier way.

Please see my attachment as a very basic example of the type of textual modifications to which I am referring.

Ken Puls
2012-05-23, 01:26 AM
I knocked up a quick VBA function for this:


Function CleanText(rng As Range) As String
Dim ch As Long
Dim sTemp As String

For ch = 1 To Len(rng.Text)
If rng.Characters(Start:=ch, Length:=1).Font.Strikethrough = False Then
sTemp = sTemp & Mid(rng.Text, ch, 1)
End If
Next ch
CleanText = sTemp
End Function


You then use the formula in your worksheet as:

=CLEANTEXT(A2)

This will sanitize all strikeouts to give you a revised pattern of what you're looking for. Be aware that trailing spaces should also be struck out as well, as you can see from the example.

I'll also warn you that this will probably slow down large workbooks considerably, but you may be okay with that as it accomplishes your goals.

The code needs to go in a Standard Module (http://www.excelguru.ca/content.php?155-Where-To-Place-VBA-Code) before you apply the formula to your worksheet.

Hope it helps,

matt0925
2012-05-23, 03:45 PM
Wow, thank you. The code works really well. I appreciate your help so much. On a side note, do you recommend any particular resources for teaching oneself VBA?

Ken Puls
2012-05-25, 08:00 AM
Well...

I'm a huge fan of John Walkenback's Power Programming With VBA series. You can find a link to that book (among others) here: http://www.excelguru.ca/content.php?213-Books

If you really want to learn though, you need more than just a book. You can't get better learning that working on your own projects on a forum, and by trying to answer others questions as you start to get more comfortable. With the basics from a book and challenging yourself with real world problems your knowledge will expand very quickly.

Personally, I learned VBA by recording macros first, then found a forum and began asking a LOT of questions, then got John's 2002 Power Programming book, then started trying to give back in forums. Each time I hit a forum my knowled increased exponentially. :)