How to help with slow Power Query for queries that reference other queries?

Scotty81

New member
Joined
Sep 20, 2018
Messages
2
Reaction score
0
Points
0
Excel Version(s)
Excel 2016
The scenario: I've got some small data sources (a few thousand rows) as Excel tables inside a workbook where I'm performing some Power Query math. Specifically, I read the data in PQ from the tables, augment the data with some meta data, aggregate the results, and then load it to a pivot table (regular, not power). But, I also do further calcs, lookups and aggregations, and do this a couple of times, since folks need to see the data at different levels of granularity.

As I've added the layers, model performance has slowed considerably. In particular, when I open PQ and click on the last query in the stack, it takes about 4 minutes to load. Sprinkling Table.Buffer commands in the M code doesn't help; in fact, it slows down the updates. Since I have a lot of machinations to perform, on one hand, it makes sense for me to keep those machinations in PQ. On the other hand, if performance is intolerable, then I'd be forced to implement those machinations in Power Pivot - even though I'm making changes (table merges, and other calcs) at the row level.

I haven't found any Power BI books that address this issue. Are there any recommended sources with some practical and specific advice?

Thanks in advance.
 
Hi Scotty,

Unfortunately, performance tuning is more of an art than a science in Power Query today, but there are some things that you can do to possibly help.

  1. If Table.Buffer() doesn't help you, then pull it. This only helps in scenarios where you are reloading data, as it incurs extra overhead to cache. So if you're using it in the wrong place, it will hurt performance.
  2. Edit your privacy settings to disable privacy for the workbook. I've seen this cut load times by as much as 30%

After that, you have to get into looking at the queries to see what's going on and how you are approaching things.

Transposing large tables can be a brutal performance hit, for example. When we teach how to unpivot subcategories in our Power Query Academy, we teach people to split things up to only transpose the required rows, and skip the rest of the table.

To be honest, we'd need to see the M code and sample data to see if there is anything that pops out as obvious culprits here.
 
Hi Ken,

Thanks for your quick reply and comments on the Table.Buffer command. When you say, "disable privacy for the workbook", can you let me know specifically which privacy setting you are referring to? Within the Trust Center Settings, there were quite a few potential candidates. I'm not sure if you meant to "Disable all data connections" since my raw data is inside my Excel workbook.

I'll also see what I can port over from Power Query into Power Pivot. There are some measures that I currently calculate in Power Query that I can do in Power Pivot. A lot of what I'm doing in this model is loading some data - let's say some electricity costs, and then splitting the cost among several departments in a 1:many scenario. Once I get department costs, I then roll up them to the division level. Unfortunately, many departments are split into multiple divisions in a 1:many relationship as well. And, then I roll up again into business entities in a 1:many relationship again. That's actually not my model at all, but you get the idea that I'm starting with granular data, and doing several rounds of rollups. I've implemented a test data set with Power Query, and technically it works great, except for the slow performance.

Although I can calculate my measures e.g. sum of utility costs easily in Power Pivot instead of Power Query, I'll look to find a way to implement my 1:many calculations in Power Pivot too. I achieve this via an outer join in Power Query, but as far as I know, Power Pivot just doesn't do that. My general concern is that I'll need to flip back and forth between Power Query and Power Pivot as perform the various machinations, and I don't know if Power Pivot works like that either. If none of that makes sense, I can post a very simple Power Query model that does the initial cost allocations so you can at least see the underlying operations, in the hopes that there is a way to implement this in Power Pivot instead.

Thanks,
Scotty81
 
Back
Top