# Thread: Formula to find similar values and report difference in years

1. ## Formula to find similar values and report difference in years

Hey everyone,

I hope you can help me out. Here is the problem:

If you look at the excel file you can see a lot of countries and years starting from 1960 till 2008. The third column indicates the intensity of computer use per country per year. It is reasonable that a lot of data is missing in the first decade(s), simply because the computer is something relatively knew. In the complete dataset more technologies are present and these may be available from the 1960s onwards (for example telephones).

Using this data I want to calculate the years lag between the United States and country X, where country X is every single other country stated in the dataset. This lag can be explained as the difference in time between a countrys technology usage intensity (so country X) and the last time the United States had a similar technology usage intensity.

To give an example:
When looking at Austria, the year 1998 and the computer usage intensity (see cell C138 = 0.23342137), it can be obtained that the United States had a similar computer usage intensity in 1991 (see cell C4054 = 0.233614376). The formula then reports that the lag is 7 years (=1998  1991).

Of course it might be the case that for some countries and years there cannot be an answer.

To illustrate that I will give another example:
When looking at Argentina, the years 1988 and the computer usage intensity (see cell C30 = 0.004405223), it can be obtained that the United States had NOT a similar usage intensity in the history. The United States had in 1981 a computer usage intensity of 0.009218756, but this is much higher than the computer usage intensity of Argentina in 1988. So the formula should report either an error or just state a dot as to indicate it cannot be calculated. This illustrates directly that the computer usage intensity of country X for a given year should be larger than or equal to (>=) the computer usage intensity of the USA in some year.

If you have any questions, please state them below!

I hope you all have an idea what the formula should look like.  Reply With Quote

2. Assuming you enter a tolerance level (+/-) in a separate cell for what an acceptable difference in intensity (say 0.001 in K1), then try:

=IF(C2=".",".",IF(MAX(ABS(IF(IF(ISNUMBER(\$C\$4023:\$C\$4071),\$C\$4023:\$C\$4071-C2<=\$K\$1),\$C\$4023:\$C\$4071)))=0,".",IFERROR(ABS(B2-INDEX(\$B\$4023:\$B\$4071,MATCH(MAX(ABS(IF(IF(ISNUMBER(\$C\$4023:\$C\$4071),\$C\$4023:\$C\$4071-C2<=\$K\$1),\$C\$4023:\$C\$4071))),ABS(IF(IF(ISNUMBER(\$C\$4023:\$C\$4071),\$C\$4023:\$C\$4071-C2<=\$K\$1),\$C\$4023:\$C\$4071)),0))),".")))

confirmed with CTRL+SHIFT+ENTER not just ENTER, then copied down to just before the US list.  Reply With Quote

It is however the case when I insert the formula into the excel file (cell C2) using CTRL + SHIFT + ENTER; it gives an error.

Given the size of the formula and my knowlegde of the formula, I couldn't locate the problem of it. Could it be related to the first part of your formula, namely:
=IF(C2=".",".",  Reply With Quote

4. What kind of error?

Perhaps in your version of Excel, you might need to replace all the commas that separate the parameters with semicolons?

=IF(C2=".";".";IFERROR(ABS(B2-INDEX(\$B\$4023:\$B\$4071;MATCH(MAX(ABS(IF(IF(ISNUMBER(\$C\$4023:\$C\$4071);\$C\$4023:\$C\$4071-C2<=\$K\$1);\$C\$4023:\$C\$4071)));ABS(IF(IF(ISNUMBER(\$C\$4023:\$C\$4071);\$C\$4023:\$C\$4071-C2<=\$K\$1);\$C\$4023:\$C\$4071));0)));"."))  Reply With Quote

5. Yes, that was the problem.

I checked your formula and it works good! There are however some (small) problems. I will again illustrate them with an example.

The first problem affects many countries, I will illustrate it using Zimbabwe:
In Zimbabwe 1990 - 1996, the computer usage intensities where always lower than anything the USA has ever reached.
0.007258294 (Zimbabwe 1996) < 0.009218756 (USA 1981)
There is no data available for the USA before 1981. Therefore the formula should either report an error or just a dot to illustrate it cannot be calculated.
If we look at Zimbabwe and the year 1997 compared to the USA;
0.00989763 (Zimbabwe 1997) > 0.009218756 (USA 1981)
So from this we can say that the lag is 16 years. (your formula explains this correctly!)

The second problem affects only a small number of countries, I will illustrate it using Switzerland:
In Switzerland in the year 2000, the computer usage intensity is 0.651577161.
0.651577161 (Switzerland 2000) > 0.624629444 (USA 2001)
As you can see it should be the case that the lag is negative! Switzerland had more computers in the year 2000 than the USA. So the lag should not be 1, but -1. Probably because of the ABS that the function is somewhat incorrect.
The same applies to the year 2001 of Switzerland, this is obtained in the USA in 2002. So the lag should again be -1.

Another minor issue, but that is not really a problem as I can fix it manually:
In Switzerland in the year 2002, it cannot be compared anymore to the USA. As Switzerland in 2001 was already higher than the USA ever reached.

Thanks you for helping me out. You already made the formula better!  Reply With Quote

6. The first problem. My formula does already return a dot for the Zimbabwe 1990-1996 years.

The Second problem. My formula should be adjusted, as you correctly guessed, to exlude the ABS around the subtraction function.

The Third problem. I am not sure I understand... but if you can correct manually, maybe better, as it sounds it might be complicated with formula.  Reply With Quote