Recently I’ve been working on automating a business process for a client. It’s become a really interesting project that uses a mix of Excel tables, Power Query, Power Pivot and VBA, as well as a WinAutomation script. Before I talk about the issue I’m facing, it would be helpful to provide a bit of background on what we’re doing, and why so many of the Power BI pieces are needed.
My client has a business in which they outsource employees to other companies. Each of those companies maintains its own records of hours, and lets my client download a spreadsheet of the hours logged by the subcontracted employees. This is very helpful: my client is the one who pays all of the employees, so the hours tell them who gets paid for how much. Easy enough, right? But each spreadsheet is in a different format, which means a lot of manual entry to standardize everything into a file that can be uploaded to the payroll processor.
Enter Power Query. With it we can import each of the files individually, reshape the data into a consistent set of columns, and output the results to a table.
Each pay period we start a new Excel file and import the data files for that period by running a WinAutomation script. The script logs in to the remote systems, passes the correct parameters, and downloads the required Excel and CSV files. It then saves them in a subdirectory of our application named for the pay period end date. My Power Query scripts then use that pay period end date to dynamically build the file paths and import the data.
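To make that concrete, here’s a minimal sketch of the kind of dynamic path logic I’m describing. The folder layout, parameter table name and file name are invented for illustration; the real scripts are more involved.

```
// Hypothetical M sketch: read the pay period end date from a
// worksheet table named "Parameters", build the folder path for
// that period, and import one of the downloaded CSV files.
let
    PeriodEnd  = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content]{0}[PayPeriodEnd],
    FolderPath = "C:\Timesheets\" & Date.ToText(PeriodEnd, "yyyy-MM-dd") & "\",
    Source     = Csv.Document(File.Contents(FolderPath & "hours.csv")),
    Promoted   = Table.PromoteHeaders(Source)
in
    Promoted
```

Because the folder name changes every pay period, every one of those file paths is brand new each time, and that detail is exactly what triggers the problem below.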
It’s beautiful… except…
Here’s where the pain begins… Every time you touch a data file in Power Query that you haven’t used before, you get a security prompt asking what type of data (Public, Private or Organizational) you are working with. (You can learn more about those levels here: http://office.microsoft.com/en-us/excel-help/privacy-levels-HA104009800.aspx) The issue is that each payroll I create new files in a new subdirectory, so Power Query sees them all as unique. That means I can’t just trust the data files once and be done with it.
Now, there is a workaround… just enable the Fast Combine feature. That suppresses the warning entirely, but it comes with its own problem: the setting is workbook specific, which means that when my VBA saves the master payroll control file under a new name (we preserve history this way) the setting doesn’t carry over. Grrr. Given that there is no way to touch this setting via code, my user has to remember to click the Fast Combine button every time they run the update. Is it minor? Sure. But it’s ridiculous, it gets forgotten, and the permissions messages worry them when they appear.
The Solution We Need
Now don’t get me wrong, I’m all for security, but only where it makes sense. In Office 2007 we got a new macro paradigm that allowed us to trust folders on our computers and networks. That setting applies at the application level and persists between Excel sessions. Beautiful: I can point it at a specific folder and forget it. Trusted files go in there; unknown ones stay out. It actually allows me to practice safer computing overall.
In my opinion, this capability is sorely lacking in Power Query. I really need the ability to give the Power Query add-in trusted root folders whose subfolders are trusted too. That would let me trust my application’s data directory and not have to remember to click Fast Combine every time I create new files. It also means I might actually pay attention to the message when it does come up in the future, as it would be unusual. Currently I see the prompt so often that I just cancel it and go straight to Fast Combine. Hmm… kind of like how we set macro security to “Low” in Excel 2003 and earlier to bypass the annoying message, which essentially left us unprotected.
The Solution (Some of us) Can Implement
To be fair, I believe I do have a workaround for this now, but I don’t think it should be necessary.
Basically, I could use VBA to drive the refresh of my tables, and therefore the Power Query scripts. Given that, I could also use VBA to copy the files into a central repository whose hierarchy never changes. For example, I could set up a folder like AppData\Timesheets\Current and use VBA to copy the required data files from my dated subfolders into it, replacing the previous period’s files. I would then point my Power Query scripts at the files in the “Current” folder and build my solution off that.
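A rough sketch of the VBA involved might look like the following. The paths and procedure name are invented for illustration, so treat it as a starting point rather than production code:

```
' Hypothetical sketch: copy this pay period's data files into a
' fixed "Current" folder, then refresh every connection in the
' workbook so the Power Query tables reload from the new files.
Sub UpdateFromCurrentFolder(ByVal PeriodEndFolder As String)
    Dim fso As Object
    Set fso = CreateObject("Scripting.FileSystemObject")

    ' Overwrite whatever is in Current with this period's files
    fso.CopyFile "C:\Timesheets\" & PeriodEndFolder & "\*.*", _
                 "C:\Timesheets\Current\", True

    ' Refresh each connection so the queries pick up the new data
    Dim cn As WorkbookConnection
    For Each cn In ThisWorkbook.Connections
        cn.Refresh
    Next cn
End Sub
```

The key point is that the paths Power Query sees never change; only the file contents do.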
Because Power Query holds its list of trusted files at the application level (rather than the workbook level), those files should stay trusted even when the workbook is saved under a new name, removing the need to constantly click the “Fast Combine” button. So I think this should work.
So what’s the problem? Only that it involves writing a VBA macro. That’s fine for me, as I know VBA and can make this happen, but for most of the Power Query target market I would suggest this is outside their normal skill set.
Even though it is technically possible to work around this issue, I still argue it should not be necessary. We need a proper option to trust the files in a local folder, so that dynamically referencing a file in Power Query can be scripted without invoking a painful security paradigm.