Highest total function - a work in progress

kitsu_ne

New member
Joined
Jan 15, 2014
Messages
16
Reaction score
0
Points
0
Location
http://www.khanacademy.org/profile/killervulpix/#p
I'm trying to create a function that I can use to return a top total in a range (that is specified). There are more (somewhat complicated) steps that I will have to work through after this, but for now, if I can get passed this part I think I should be good to go.

Here's what I'm imagining:

I have a set of data like the following:
Month
Hours
Jan
1
Feb
2
Mar
3
Jan
4
Feb
5
Mar
6
Jan
7
Feb
8
Mar
9

So the idea is to create an array of unique values in the months column, as well as a summation of the corresponding hours.

Function Driver(sum As Range, driverRng As Range, val As Integer, Optional title As Boolean)

so, "sum" is the column that I want to total, "driverRng" is the months column that I want condensed,
"val" is which value I want to return (1 being the higest value, 10 being the tenth's hightest and so on),
and lastly "title" if specified as True would return the actual driver, and not the summation. So for instance True in the example above would return "Mar" while False would return "18"

Here's what I've put together.
Sadly, the numbers are not even totalled correctlt.

Also, I still need to figure out how to sort the values so that it returns the top number, which I have no idea how to do at the moment.

Code:
Function Driver(sum As Range, driverRng As Range, val As Integer, Optional title As Boolean)
    Dim drvList As New Collection       ' This will contain a list of all the values in driverRange, but will not duplicate.
    Dim ttlList As New Collection       ' This will hold the final list of data.
    Dim drvSum As Double                ' This will allow us to keep track of the summation for each item in drvList.
    Dim ret As Variant                  ' This will be the end value to return.
    Dim i As Long                       ' i will keep track of the instance we are working with in loops.
    Dim d As Double                     ' d will hold a temp total of each driver in a loop.
    On Error Resume Next
    
    For Each a In driverRng
        drvList.Add a, a                ' arr.Add "value", "key" - Since we can only have 1 key of each item, duplicates are not added to the table.
    Next

    
    For Each itm In drvList             ' For each unique driver that was found...
        i = 0                           ' Set the count, and the total to 0.
        d = 0
        For Each cell In driverRng      ' Then loop through all the cells in the driver's range...
            If itm = cell Then          ' If the cell is the same as the item we're working on...
                d = d + sum(i)          ' grab the corresponding number value and add it to the total for the item.
            End If
            i = i + 1                   ' Increase the instance count as we cycle through the cells so we know what number to add.
        Next cell
        ttlList.Add d, itm              ' Once all cells have been cycled add the total to the total's collection, and then move on to the next item.
    Next itm
    If title = True Then Driver = drvList(val)  ' Once all drivers have been cycled through we check what value is wanted and return the driver title if True...
    If title = False Then Driver = ttlList(val) ' Or the total if False, or not specified.
End Function

Currently, this returns the following:

numberFALSETRUEShould be:
17Feb15
213Jan12
316Mar18

If anyone has any suggestions please let me know,

Thanks.
~Kitsu
 
Last edited:
let save time, attach please a sample file
 
It won't let me edit my previous message, but I wanted to add that I noticed that I accidentally created two separate veriables to track the summation totals. This doesn't have anything to do with why the function is not working properly but I wanted to mention it.

Ignore this line entirely:
Code:
    Dim drvSum As Double                ' This will allow us to keep track of the summation for each item in drvList.
 
I actually figured out what was causing the numbers to not match up = since I was setting i to 0, thinking that the collections started at 0, it was not adding the propper numbers together. When I set i to 1 all the numbers matched up perfectly.

But now the problem that I have now is, how can I sort both ttlList and drvList so that the values match up? I need to be able to sort these values by highest to lowest total.

I've attached the updated file.
 

Attachments

  • DriverFunction.xlsm
    17.4 KB · Views: 7
You can get the same results you're getting very quickly indeed without a udf by using a pivot table. See attached which gives you summarised and sorted data. You can then get the top value, either by looking at the topmost value or you can dispense with sorting and only display the highest value(s) (I did this in the 2 rightmost pivots in the attached).
If for some reason this isn't going to be good enough, I'll add a sorting routine to your function.

ps.
You're adding ranges to the drvList collection; it might be less resource hungry just to add values.

If you have the vb.Net framework files installed you may be able to use either a Systems.Collection.ArrayList which has built-in sorting or a Systems.Collection.SortedList.

Consider also converting the result of your function to a complete 2D array whci you could then array-enter into a range on your sheet. This would also be less resource hungry as the function would only be executed once to get all the results in the cells, instead of once per cell that the function is in.

I'll wait for you to come back before I do any coding/tweaking.
 

Attachments

  • ExcelGuru2913DriverFunction.xlsm
    26.6 KB · Views: 14
Updated

The function that I am trying to build is a little more complicated that I had previously mentioned. I updated my spreadsheet to include more of what I'm trying to include, as well as a modified snippet of the data that it would be iterating through. I updated the function as well, and I hope it's easy to understand.

I had been using pivot tables in the past, but as you can see in the attachement, I need to pull data by a date range. In my main spreadsheet I have visual basic modify the contents of the 10 pivot tables by going through every instance of the date and making hidden the dates that do not fit the range, or making visible the ones that do. Because there can be a lot of records by the end of the month, half way through the month (I need a monthly range and a weekly range, which is why there are 10 tables) it takes the file quite some time to go through each table and update them. Because of this, and because the function would be used throughout the file, I'm trying to get away from using pivot tables.

What you describe sounds promising, but I've never looked into that before. I only just found out about creating collections while I was trying to figure this out.

Thanks for your time,
Kitsu
 

Attachments

  • DriverFunction.xlsm
    27.4 KB · Views: 10
I am all for speeding the function up and using less resources! That's why I'm trying to do this in the first place! I looked to see if I have the vb.Net framework using the steps described here: http://msdn.microsoft.com/en-us/library/hh925568(v=vs.110).aspx
The "FULL" folder exists but there is no Release value, so I guess that means I do not have 4.5 or newer? My folder looks exactly like the snapshot there, just without the Release value/file.
 
OK, I'm aware of time passing, so here is a possible solution that might be better than what you have but it is very much WORK IN PROGRESS.
It uses a dictionary object to get unique drivers and sum their hours at the same time. This is then converted into a 2-D array and sorted.
Dictionaries are not the fastest tool in the box, and the bubble sort I used is one of the slower sorting algorithms.

I'll continue to explore:
built-in collections (probems with outputting the keys)
Systems.Collection.ArrayList
Systems.Collection.SortedList
 

Attachments

  • ExcelGuru2913DriverFunction DictionarySoln.xlsm
    35.5 KB · Views: 12
Very cool, I like how you can do that! I don't entirely understand what's going on though.

I tried to recreate your while loop. I had written a similar "for" loop, but this is my first attempt at using a while loop. It looks like it works, but for some reason it seems to skip "0" values for some reason... Did I do something wrong?

Also, one thing that I do want, however, is to be able to specify the number of the driver that I want because sometimes I'll only want one of them, but it won't necesarrily be the top most value.

I'm still interested in what you are/were working on. I've read a little about dictionaries a year ago but I didn't undertand as much as I do now, so it will be helpful to see what you came up with.


I really apreciate you taking a look, p45cal, thank you!
 

Attachments

  • DriverFunction.xlsm
    38.8 KB · Views: 7
I'd like to know if the dictionary solution is going to be fast enough before coding with other methods.

Looking at your Driver Function as in your last attachment:
On Error Resume Next is a killer while developing because it hides where things go wrong.
There are times when you must use it, even while developing and here you must use it, but only while trying to add items to a collection. So you must switch it off again when you no longer need it with On Error Goto 0.
If statements with just one statement between the If and End If may as well be on a single line.
With the drvList collection the first item is drvlist(1), not drvlist(0), so your:
ReDim arr(0 To drvList.Count, 0 To 1)
should be:
ReDim arr(1 To drvList.Count, 0 To 1)

Likewise, instances of:
For i = 0 To drvList.Count
should be:
For i = 1 To drvList.Count

I see you have two such loops; why not have just one loop and after assigning a driver to arr(1,0), immediately tot up his hours and assign to arr(i,1) as well, before assigning the next driver?

In your totting up hours section you have a variable d to hold a running total and assign it to arr(i,1) at the end. I don't think you need it; you can use the arr(i,1) directly.

Why are you setting drivers with 0 hours to Null? This is probably the source of your "it seems to skip "0" values for some reason".
Also, in the sort routine, Null seems to ignored rather than treated as less-than or greater-than a number (the comparison returns Null rather then True or False).
I've shortend the bit of code which makes values Null but have commented it out (identifiable by double comment apostrophes in the code).

In your Sort2D function you have instances of:
For i = 0 To UBound(arr, 1) - 1
which should be:
For i = 1 To UBound(arr, 1) - 1
or:
For i = LBound(arr) To UBound(arr) - 1
(here we're using LBound instead of 1 (also we don't need the (arr,1), (arr) will do, the 1 is implicit).

Now your sort routine doesn't compare the right things, so stuff gets moved around a lot but no real sorting is done!!

For a sort of this kind you should always compare adjacent values.
If you run along the whole array once, comparing adjacent values, you end up with the smallest value at the bottom (since you're always sorting descending) which really never needs to be compared to anything again, nor moved.
The second time you run along the array, you do the same and when that run is complete, the second to last lowest value is in its final place.
Now the inside loop could be arranged to get shorter at each iteration of the outer loop, but neither you nor I have bothered, so there are some redundant comparisons. With small numbers of drivers this isn't going to slow us down too much, so we'll leave it be.
One thing that can be checked though, is if after any one run along the array, no swaps are carried out at all, then the list must be fully sorted and we can cut short the process. This is where blnNoSwaps comes into play; you've got the setting and testing of this variable in the wrong places. It needs to be set to True at the start of the internal loop, set to False if a swap happens, and tested at the end of the internal loop.

So that's you sorted (oops, I didn't mean to be punny).

Finally, as far as coding goes, the two lines:
If title = True Then Driver = arr(val, 0)
If title = False Then Driver = arr(val, 1)
can be shortened to one line:
If title Then Driver = arr(val, 0) else Driver = arr(val, 1)
(that one not quite in that form in the attached)
or more obscurely, to:
Driver = arr(val, title + 1)

This Driver function is not efficient because every cell it's in means the whole array needs to be calculated. You can see a significant delay in the values updating if you change any value in column C (the hours). That delay will get a lot longer if you have a longer table, especially if with more drivers.

Regarding "to be able to specify the number of the driver that I want", you wouldn't want to ue the same routine to extract this information, a formula would suffice.

See cells Q8 and R8 (R8 is reportedly more efficient)
Q8 is:
=SUMPRODUCT(--($E$2:$E$90=""&P8),--($D$2:$D$90=$K$3),--($B$2:$B$90>=$H$1),--($B$2:$B$90<=$H$2),$C$2:$C$90)

or you could shorten that to:
=SUMPRODUCT(($E$2:$E$90=""&P8)*($D$2:$D$90=$K$3)*($B$2:$B$90>=$H$1)*($B$2:$B$90<=$H$2),$C$2:$C$90)

R8 is:
=SUMIFS($C$2:$C$90,$D$2:$D$90,$K$3,$B$2:$B$90,">=" & $H$1,$B$2:$B$90,"<=" & $H$2,$E$2:$E$90,P8)

If you wanted to have your own user defined function then you could have the formula in cell S8:
=OneDriver($C$2:$C$90,$D$2:$D$90,K3,$B$2:$B$90,$H$1,$H$2,$E$2:$E$90,P8)

supported by the code:
Code:
Function OneDriver(HoursRng, StatusRng, status, DateRng, sdate, edate, driverRng, ADriver)
OneDriver = Application.WorksheetFunction.SumIfs(HoursRng, StatusRng, status, DateRng, ">=" & sdate, DateRng, "<=" & edate, driverRng, ADriver)
End Function

I've made most of the above changes in the attached file.
 

Attachments

  • ExcelGuru2913DriverFunctionMsg10.xlsm
    38.8 KB · Views: 16
Last edited:
Thanks for all your help, I really do appreciate it.

The reason why I can't use the SUMPRODUCT formula is that I won't know the actual driver that is needed for the cell. I was looking for a way to specify if I wanted the 1st highest, second highest, third, and so on, but these values would not necessarily be in the same range together. For example:
Cell1 “The second highest driver is:” Cell2 (Driver title) Cell3 “With” & (Driver Hours) & “hours.”
The second highest driver is:
422
with 56.0 hours.

Then the rest of the worksheet would have information about that driver.
I can see why it's not efficient, having 10 drivers, two statuses (NM/NS) and being used once for the title, and once for the value, not to mention who knows how many records it's going to be going through by the end of the month.

I won't be able to use this function in my Daily product; I'll have to stick to pivot tables unfortunately. However, for my monthly drivers spreadsheet that uses a data connections to pull its info, I should be able to use this function to quickly total up the top 3 system drivers, and the top 3 subsystem drivers for each of those more efficiently.

Thank you for teaching me a few things I didn't know before, and clarifying things I had misunderstood! Also, to clarify a few of my earlier mistakes:
Why are you setting drivers with 0 hours to Null? This is probably the source of your "it seems to skip "0" values for some reason".
Also, in the sort routine, Null seems to ignored rather than treated as less-than or greater-than a number (the comparison returns Null rather then True or False).
I've shortend the bit of code which makes values Null but have commented it out (identifiable by double comment apostrophes in the code).
I was trying to remove the driver if it had no hours so that it wouldn’t be a distraction. It’s not wanted. I tried to make these titles/values null thinking that the sorting would push them to the bottom of the array, and then they would just be blank values. I realized that it was a little tricky to delete a value from the middle of an array so I was trying to find something.

Code:
        For i = 0 To UBound(arr, 1) - 1
            For j = 1 To UBound(arr, 1)
I thought with arrays the first most value was 0, while 1 was used for collections, is this wrong?
By typing “For i = 0 To UBound(arr, 1) – 1” I was telling the program to cycle through the first, through the second to last values.
By typing “For j = 1 To UBound(arr, 1)” I was telling the program to cycle through the second, through the last values. This way it would start off as comparing the first to the second, all the way to the first through the last values, and cycle through repeatedly until it finished with the second to last value being compared to the last value.

This way, “If arr(j, 1) > arr(i, 1) Then” was comparing arr(0,1) to arr(1,1) and cycling through the list.

This is the only thing that’s tripping me up at the moment, everything else you did I understand why it works the way you described.

~Kitsu
 
The reason why I can't use the SUMPRODUCT formula is that I won't know the actual driver that is needed for the cell. I was looking for a way to specify if I wanted the 1st highest, second highest, third, and so on, but these values would not necessarily be in the same range together. For example:
Cell1 “The second highest driver is:” Cell2 (Driver title) Cell3 “With” & (Driver Hours) & “hours.”
The second highest driver is:422with 56.0 hours.
You're quite right, I didn't read properly (even though I quoted it!)



I won't be able to use this function in my Daily product; I'll have to stick to pivot tables unfortunately.
Have you speed-tested (a loosely used term) my dictionary solution offered in msg#9? I haven't, but I think it's faster. I may get round to doing some tests myself. I would like to hear how you get on with it.



I was trying to remove the driver if it had no hours so that it wouldn’t be a distraction. It’s not wanted. I tried to make these titles/values null thinking that the sorting would push them to the bottom of the array, and then they would just be blank values. I realized that it was a little tricky to delete a value from the middle of an array so I was trying to find something.
You could make the titles Null if you wanted not to see a list of titles with 0 hours. You could even make the hours Null too, but only after sorting.



I thought with arrays the first most value was 0, while 1 was used for collections, is this wrong?
This is not always true, it depends on a few things. Collections I think always start with Item(1). Arrays you can specify the base. From a developer's point of view, if you use both, as we are here, it is easier to try to keep the indices so that the same index corresponds to related items in both. This saves us having to remember that it's 1 different; mmm… so is drvlist(i+1) which corresponds with arr(i) or the other way around? Since the collection is based 1, make the array based 1 too, then this helps eliminate mistakes in coding.
In your sample file, the count of drvlist was 16. There were 16 elements. When you redim arr(0 to drvlist.count you're creating 17 members, 0 to 16 rather than 1 to 16.

Regarding arrays in general, by default when you Dim arr(4) you get an array arr(0 to 4). If you have Option Base 1 at the top of the code module you get arr(1 to 4).
You can also create an array with the likes of myArray = Range("A1:C3") you get myArray(1 to 3, 1 to 3) a 2-dimensional array, both 1-based, regardless of any Option Base statement. Single columns and single rows also yield a 2D array (not 1D as you might expect):
myArray = Range("A1:A3") yields myArray(1 to 3, 1 to 1)
myArray = Range("A1:C1") yields myArray(1 to 1, 1 to 3)



By typing “For i = 0 To UBound(arr, 1) – 1” I was telling the program to cycle through the first, through the second to last values.
By typing “For j = 1 To UBound(arr, 1)” I was telling the program to cycle through the second, through the last values. This way it would start off as comparing the first to the second, all the way to the first through the last values, and cycle through repeatedly until it finished with the second to last value being compared to the last value.

This way, “If arr(j, 1) > arr(i, 1) Then” was comparing arr(0,1) to arr(1,1) and cycling through the list.
Right at the beginning of the loops you are! but it soon goes wrong. I added a line to your Sort2D routine:
Debug.Print "Is member " & j & " less than member " & i & "?"
just before where the comparison oisactually made. I got this in the Immediate pane:
Is member 1 less than member 0?
Is member 2 less than member 0?
Is member 3 less than member 0?
… (snipped)
Is member 15 less than member 0?
Is member 16 less than member 0?
Is member 1 less than member 1?
Is member 2 less than member 1?
Is member 3 less than member 1?
… (snipped)
Is member 15 less than member 1?
Is member 16 less than member 1?
Is member 1 less than member 2?
Is member 2 less than member 2?
Is member 3 less than member 2?
… (snipped)
Is member 15 less than member 2?
Is member 16 less than member 2?
Is member 1 less than member 3?
Is member 2 less than member 3?
Is member 3 less than member 3?
… (snipped)
Is member 15 less than member 3?
Is member 16 less than member 3?
Is member 1 less than member 4?
Is member 2 less than member 4?

I've highlighted 2 lines in red.
Now let's say that we went in with a list that was already sorted, so no swaps would be needed at all. One of the red lines has to be true (if they're not equal), so at least one swap would take place. I've just picked out 2 example lines, this'll be happening for all the other pairs of members.

(I did an experiment; I went in to your Sort2D with 10,9,8,7,6,5,4,3,2,1,0 being an already sorted list and came out with 0,10,9,8,7,6,5,4,3,1,2 and it having made 27 swaps.)
 
Last edited:
Back
Top