One of the cool features in Excel 2013’s Power Query is being able to load to the Data Model (PowerPivot) directly. But Excel 2010 doesn’t appear to have this feature. Interestingly, you can still load Power Query directly to Power Pivot in Excel 2010, it just takes a bit of a careful workaround.
Let’s look at the required steps
Step 1: Create Your Connection
First, I’m going to load in the content of a text file. So I:
Go to Power Query –> From File –> From Text
I browsed to the file I needed, and imported it into Power Query
I do whatever cleanup is needed and name the query Sales
Next, we go to the Home tab –> Close and Load –> Close and Load To…
And here’s the important part:
Choose “Only Create Connection” –> Load
And I’ve now got a basic connection to my sales table without landing it in a worksheet:
Step 2: Grab the Connection String
Now, here comes the secret. We need to get the connection string that Excel uses to connect to the Power Query. Here’s how:
Go to the Data tab –> Connections
In there, you’ll see the name of your new connection:
Select your Query and click Properties
Click the Definition tab
Now you’ll be looking at something like this:
Notice that this query is actually an OLE DB Query that is simply “SELECT * FROM [Sales]” That seems easy to work with. But the key for us is the connection string shown (#2 in the image above).
Select the ENTIRE connection string
Press CTRL + C to copy it
Note: Make sure you start at “Provider=” and highlight all the way to the end. (It’s much longer than what you see in that little box.)
Load Power Query directly to Power Pivot
Finally, we’re going to pull this into Power Pivot. To do this:
Go to the Power Pivot tab –> PowerPivot Window
From Other Sources –> Others (OLEDB/ODBC) –> Next
Name your table
Paste your Connection String in the box
Click Next –> Next –> Finish –> Close
And voila! We have our Power Query linked directly into Power Pivot in Excel 2010!
Just remember… if you do this, NEVER modify this table in Power Pivot. Always go back to modify the table in the Power Query stage. Failure to do so could set the table into a non-refreshable state.
In a comment on my Parameter Tables post, Talha asks why we get prompted with a Native Database Query message each time the value of the parameter used in a SQL query changes:
I had a question. The value I grab from my parameter table, I store it in a variable and use it in my SQL query. However, every time I change the parameter, I get the following prompt:
NATIVE DATABASE QUERY
You are about to run a native database query. Native queries can make changes to the database. Would you like to run the following query?
Is there any way I can make this go away?
To answer this, it helps to understand why this message comes up.
Replicating The Issue
When you connect to a SQL server database in Power Query, there are three parts to the initial dialog:
Server and Database are pretty straight forward, but the SQL Statement is marked as optional, and is actually hidden under the arrow at the left. Now, for anyone who has written SQL in the past, you open up that little window, pop in your SQL and click OK. And you’ll get prompted with a message about trusting a Native Database Query.
On the other hand… if you DON’T fill in a SQL statement, you’ll be taken to a list of all tables and views in the database that you have access to, and will be able to bring those in to start working with them. Do that, and you’ll NEVER SEE THIS PROMPT.
Why Prompt About a Native Database Query?
A native database query is basically SQL code that you provide to pass to the database. Power Query treats this as potentially dangerous, so gives you a warning when you do this. And every time you change the underlying SQL query (pass one parameter to it to change it in some way), the Query is seen as “new” and again, potentially dangerous.
I tend to agree that this is a bit overkill. I’m not a SQL expert, by any means, but if the query is just a SELECT query, then I’m not sure why we need to be warned about it. You’d think that would be easy enough for the PQ team to parse out, looking for keywords like ALTER, DROP, UPDATE, DELETE and such, but regardless, that’s the way it is.
Avoiding the Error
Now that we know what is doing it, how do we avoid it? The answer is actually more simple, but harder to swallow than you might like. Don’t use the Native Database Query functionality!
I really struggled with this at first. I saw Power Query as a “SQL Sever Management Studio Light” for Excel pros. It’s not. It’s totally different. What the PQ team has built is a way for non-power users to be able to query, clean, restrict and reshape their data. And if you do it their way, then you won’t get this irritating prompt.
In the case of grabbing data from SQL Server, you’d simply connect to the server, grab the table/view you want, then start setting your filters in Power Query to restrict down the data you want. The UI makes this easy, and it’s a lot easier to tweak the data to get it right than forcing an Excel person to go back and tweak their SQL.
And here’s the cool part… once you’ve done this, Power Query takes advantage of a technology called “Query Folding”, which folds up all the reshaping commands you’ve issued. It passes those to the SQL server as a single command to bring back your data. In layman’s terms (although not perfectly technically correct), you can look at this as if Power Query is allowing you a graphical way to write your SQL for you behind the scenes, then pass it to the server. And because Power Query knows it won’t do anything dangerous, it doesn’t prompt you about potential problems.
What About Performance?
A few months ago I asked someone at Microsoft this question… the thrust of my question was if Power Query would be more or less efficient than passing an optimal SQL query. His reply was that no, if someone was a SQL Ninja, they could certainly come up with something that performed better in some situations. Having said that, for the majority of Excel pros, Power Query will do a better job, as we don’t tend to be SQL optimization experts.
Further to this, the very worst thing you can do for performance is provide a base query then try to do further filtering. Look at these three cases:
Use Power Query to:
Connect to table Transactions
Filter to Dept <> 150
Native Database Query:
Select * From Transactions Where Dept <> 150
Native Database Query:
Select * From Transactions Use Power Query to filter:
Dept <> 150
Case 1 is the preferred method when dealing with Power Query. It will roll both steps up into a single statement, and retrieve the requested set from the server.
Case 2 returns the same data as Case 1, but prompts you with a Native Database Query on each computer you run the solution on. (You can’t even save the connection with the workbook, it’s a user specific trust flag.)
Case 3 is the worst performing. The reason is that the initial table has to be brought to Power Query in it’s entirety, and THEN you can start filtering. Why? Because Power Query cannot query fold it’s commands into the SQL you provided.
Even though connecting to a table, then re-creating all the filtering steps seems less graceful at first, this is the method I’d recommend you take. In fact, I’ve actually made it a habit to remove any Native Database Queries from project I wrote in the past, replacing them with the recommended method.
To me this offers two benefits:
No security prompting. May seem minor, but when you put a Power Query error in front of non-tech users, they tend to freak out. Why put them through that?
It makes the query easier to follow/tweak when you need to debug it later. And let’s face it, sooner or later you’ll need to do this.
The next step was to pull it into Power Query, but I didn’t have a Table or normal Named Range to pull from, so I went to Power Query –> From Other Sources –> Blank Query.
The Normal Piece
As I’d shown earlier, I typed the following in the formula bar in the Power Query editor:
And at this point I was pretty chuffed as it looked pretty straight forward:
This is exactly what I was hoping I would see. Fantastic! So I did a little normal cleanup:
Expanded the content
Moved the first row to a header
Deleted the “Animals” column that was carried down the table
I then proudly announced that “Yes! It works!”, clicked Close and Load to land it in a table… and it failed:
The Data Fun House of Mirrors
That was a bit of a shock. It worked in the Query window, but when loading it failed. There’s something not adding up here. So I jumped back in to edit the query:
I was immediately presented with this:
Doesn’t make sense… so I started to step through the process. I clicked on the Source step:
Use Dynamic Ranges in Power Query
What I forgot was that I’d used =Excel.CurrentWorkbook(). That function returns a table which contains all tables, connections and named ranges in the workbook. And when we create a new Power Query, it adds an output table and a connection. So we basically got a circular reference.
No problem. We can easily deal with this. All we need to do is filter down to the Animals table that we need. (That way we don’t have to worry about any new tables being created.)
So I added that filter immediately after the Source step. And boom, everything works again because it is restricted to just the table I originally had. And now, when I commit it the the worksheet, it loads:
Perfect. It loaded, now lets just go add some data and make sure it’s working properly at update. I added a new record to the table:
So it looks like we can use dynamic ranges in Power Query without any issues.
When I was writing this up, I couldn’t replicate the error at first. The method I used was just slightly different. Here’s what I did instead:
Pulled in the table
Clicked the green Table to drill into it
It worked perfectly at update. Why? Because I drilled into a specific table, avoiding the issue at update. And in all honesty, it’s probably a better method. 😉
Something else cool that we’ll be announcing VERY soon
Now, if you’ve ever been to one of my courses, you’ll know that one of the most important things to me is to use data that is modelled on real world examples. That’s right, no canned AdventureWorks samples. I like data that real users face in the real world.
To that end, I’m looking for Power Query samples. Specifically, for sample files of data to use as examples for my Power Query blog posts, courses and other educational products. And while I’ve got lots of great data sets of my own, I’m particularly interested in stuff that I haven’t seen or demonstrated on the blog before.
So if you’ve got a nasty file that’s painful to import, or you have a few files that you’re having trouble combining, or you just can’t figure out a starting point with Power Query, I’d like to have a crack at your scenario. It’s super easy… you email me the file, I’ll take a look at it, and cook up a solution. And any new or interesting techniques that are needed become the subject of a future blog post, article, or even sample in the courses I teach.
I hope it goes without saying, but for legal reasons I’m going to say it anyway… any files you send me can and will be used in public. In order to use them in my materials, by sending them to my you are, by default, assigning me copyright to those specific source files. Because of that, it is VERY important that the data you include is not sensitive. To be fair, I’m not interested in your data specifics, as much as the form and shape of it, how it starts out, and what the end product should become.
“Okay, that’s great for you, Ken, but what do we get out of sending you our files?”
How about a working solution? Does that sound reasonable? You send me your data, and I send you back something that uses Power Query to complete the goal?
If you’re interested (and who wouldn’t be), here’s what you need to include:
A sanitized copy of your source data. (i.e. Do a mass find/replace on employee names/numbers, phone numbers, addresses and such. If you have particularly troubling data that is interesting, I may be willing to even help here.)
A copy of what you want the output to look like. (I’ll build to that end goal.)
A summary of any “rules” you use along the way to determine if something needs to change one way or another. (I’d rather not spend my time guessing about your end goals.)
Your direction as to whether you’d like your/your company name included or withheld should I decide to publish a post using your data.
In return for that, I will send you a copy of the solution once I’ve crafted it. This may happen days/weeks before any solution is posted publicly, or it may come in the form of an email to let you know that the post is live, at my discretion.
Please be aware that I also reserve the right to NOT use your submission at my discretion. This could happen if my inbox gets overloaded with requests, if I’ve already dealt with your scenario on the blog, if your data would be better handled in another fashion, if I get too busy with paying projects, if the data is just too difficult to work with, or any other reason that I haven’t listed. (I’ll do my best, but this is free work, so it’s totally at my schedule and discretion.)
If that sounds interesting to you, my email is ken at excelguru dot ca. I’m going to limit this to the first 10 data sets that land in my inbox, so if you want to try and take advantage of it, act quick!