Must be a more efficient way....

Jesse

New member
Joined
Mar 22, 2011
Messages
51
Reaction score
0
Points
0
I have some line level invoice data I'm working with. Each row is one line on an invoice, an invoice can have many items or just one.

I'm trying to flag each row if the whole invoice contains are particular type of item so I can filter when I put it into a pivot table.

Currently I'm using:

=IF(G1<>G2,IF(COUNTIF(OFFSET(T2,0,0,COUNTIF(G:G,G2)),"Fish")>0,TRUE,FALSE),D1)

To check to see if the invoice contains "Fish". The data is sorted by invoice so I first check the invoice number, if it's the same as the item above then just copy that.

The problem is my data has 120,000 rows, with four conditions to check a calculation takes upwards of 15 minutes on my Quad core PC.

Any suggestions for making this formula more efficient? I only need to flag TRUE or FALSE.

Jesse
 
Perhaps:

=SUMPRODUCT(--($G$2:$G$5=G2),--ISNUMBER(--($T$2:$T$5={"fish";"hippo"})))>0

I don't think offset is the way to go because it is a volatile function. You could use INDEX with a union operator to achieve the same, although I'm not sure you are going to get noticable benefits. It is improtant however to avoid OFFSET because all of these formula will calculate when recalculation is invoked, regardless of whether or not a precedent has changed.
 
I'll upload a sample tonight, I have to scrub it pretty thoroughly.

Jesse
 
You could use some SQL (via microsoft query or via VBA) to pull the data from the source file and then either return just the subset you need directly to a pivot table, or alternately use microsoft query or VBA SQL to add a new column that contains the 'hook' you need. SQL is probably the way to go on such a large dataset. When you post your sample data I can have a look at it, if this sounds like an approach you'd consider.

Using MS Query and SQL in excel is covered very well in an amazing blog by Craig Hatmaker. Craig's blog starts off with looking at how to use simple things like Microsoft Query (bundled with excel) to get data, then progressively teaches you more and more every post until you're using excel to add record to the database and much much more besides. There's some amazing ideas on how to implement a table driven approach to all this, so you don't have to write SQL but can do it all on a worksheet.
Check out his first entry at http://itknowledgeexchange.techtarget.com/beyond-excel/forward/ and then go forward from there. Most comprehensive explanation on excel as database front end I've ever seen.
 
View attachment Sample data.xlsx

Thanks Jeff, I'll definitely start there. Pulling less data would be ideal. I don't have access to pull the data myself yet but that should be rectified shortly.

Sample is attached.

Jesse
 
Checking your example i can't see your formula working that well, however, much simpler, assuming you have your word to look for in N1 use this and copy down as needed:
=IF(ISERROR(FIND($N$1,I2,1)),FALSE,TRUE)
 
Hi Jesse

You sample doesn't marry up with your explained structure, however I believe this should work, and that it should be considerably quicker than the method that you are using:

=SUM(COUNTIF(INDEX(I:I,MATCH(A2,A:A,0),0):INDEX(I:I,MATCH(A2,A:A,1),0),"*"&{"fish","dog","Hippo","aardvark"}&"*"))>0

Sure it's quite lengthy but I INDEX and COUNTIF are known for their ability to yield fast results...
 
Hi Jesse

I'm not sure exactly what you wanted to see in your Pivot, but the attached uses Jon's formula, and one of my own.

They return slightly different results when used in a Pivot Table, but maybe one or both of them will help you to solve your problem.

=AND((A2<>A1)+COUNTIF(I2,"*"&Pivot!$B$1&"*"))*COUNTIF(I2,"*"&Pivot!$B$1&"*")

Jon's formula (modified)
=SUM(COUNTIF(INDEX(I:I,MATCH(A2,A:A,0),0):INDEX(I:I,MATCH(A2,A:A,1),0),"*"&Pivot!$B$1&"*"))>0

I created a Pivot Table, and put a dropdown in cell B1 with various items in a DV list. There is some event code on the Pivot sheet, which refreshes the PT after the selection has been changed.
 

Attachments

  • Sample data (2).xlsm
    24.5 KB · Views: 12
Hi again, Jesse. Here's a workbook using your sample data that dynamically refreshes an SQL query feeding a traditional excel table as well as a pivot table. I've tested it in excel 2010, and hopefully it also works in 2003 and 2007 (but likely not 2002).

You may need to save it to your computer before it works...I'm not sure if it will work straight after downloading it (but I suspect it will).

You can list as many search terms as you want in column N, and the table (on the left) and the pivot table (on the right) should refresh accordingly to only return orders that include the text from your search term. The search term is currently case sensitive, meaning 'fish' will result in hits for 'Angel fish' but not for 'Fish food' (unless the order also includes 'fish' in lower text on another line).

This is a work in progress, and the way it is set up it queries your data source in one sheet, and returns your desired results to another sheet.

Note that it's generally considered bad practice to use SQL to query the same open workbook like I am here, because historically Excel has suffered something called 'memory leak'. See the comments at http://www.excelhero.com/blog/2010/07/excel-partial-match-database-lookup.html for a discussion on this, but basically if you run this code too many times your computer will ultimately crash because the code uses memory that (due to a bug in excel 2003 or lower) is not released back to the computer after the code has finished.

However, note that if you are using excel 2007 or 2010, then it looks like the memory leak bug has been fixed. And if you amend the code to grab data from another workbook or database entirely, then there will be no issues.

There's ways around the memory leak issue, as dealt with in the blog link I posted just above, but I haven't (yet) implemented it in this code,...largely because as I say above, it doesn't seem to be an issue in excel 2007 or greater.

Feel free to come back to me with any comments. Note that I'm still fairly new to VBA, so this code could likely be improved on.
 

Attachments

  • Sample data 6.xlsm
    36.2 KB · Views: 20
Thanks for the suggestions, I'll try them and post my results.

Thanks for the SQL link Jeffrey, definitely an area I intend to move into next.

Jesse
 
Back
Top