Load Power Query directly to Power Pivot in Excel 2010

One of the cool features in Excel 2013’s Power Query is being able to load to the Data Model (PowerPivot) directly.  But Excel 2010 doesn’t appear to have this feature.  Interestingly, you can still load Power Query directly to Power Pivot in Excel 2010, it just takes a bit of a careful workaround.

Let’s look at the required steps

Step 1: Create Your Connection

First, I’m going to load in the content of a text file.  So I:

  • Go to Power Query –> From File –> From Text
  • I browsed to the file I needed, and imported it into Power Query
  • I do whatever cleanup is needed and name the query Sales
  • Next, we go to the Home tab –> Close and Load –> Close and Load To…

And here’s the important part:

  • Choose “Only Create Connection” –> Load

And I’ve now got a basic connection to my sales table without landing it in a worksheet:

image

Step 2: Grab the Connection String

Now, here comes the secret.  We need to get the connection string that Excel uses to connect to the Power Query.  Here’s how:

  • Go to the Data tab –> Connections

In there, you’ll see the name of your new connection:

image

  • Select your Query and click Properties
  • Click the Definition tab

Now you’ll be looking at something like this:

image

Notice that this query is actually an OLE DB Query that is simply “SELECT * FROM [Sales]”  That seems easy to work with.  But the key for us is the connection string shown (#2 in the image above).

  • Select the ENTIRE connection string
  • Press CTRL + C to copy it
  • Click Cancel

Note:  Make sure you start at “Provider=” and highlight all the way to the end.  (It’s much longer than what you see in that little box.)

Load Power Query directly to Power Pivot

Finally, we’re going to pull this into Power Pivot.  To do this:

  • Go to the Power Pivot tab –> PowerPivot Window
  • From Other Sources –> Others (OLEDB/ODBC) –> Next

image

  • Name your table
  • Paste your Connection String in the box

image

  • Click Next –> Next –> Finish –> Close

And voila!  We have our Power Query linked directly into Power Pivot in Excel 2010!

image

Just remember… if you do this, NEVER modify this table in Power Pivot.  Always go back to modify the table in the Power Query stage.  Failure to do so could set the table into a non-refreshable state.

Native Database Query Security Prompts

In a comment on my Parameter Tables post, Talha asks why we get prompted with a Native Database Query message each time the value of the parameter used in a SQL query changes:

I had a question. The value I grab from my parameter table, I store it in a variable and use it in my SQL query. However, every time I change the parameter, I get the following prompt:

------------------------------------------------

NATIVE DATABASE QUERY

You are about to run a native database query. Native queries can make changes to the database. Would you like to run the following query?

------------------------------------------------

Is there any way I can make this go away?

To answer this, it helps to understand why this message comes up.

Replicating The Issue

When you connect to a SQL server database in Power Query, there are three parts to the initial dialog:

image

Server and Database are pretty straight forward, but the SQL Statement is marked as optional, and is actually hidden under the arrow at the left.  Now, for anyone who has written SQL in the past, you open up that little window, pop in your SQL and click OK.  And you’ll get prompted with a message about trusting a Native Database Query.

On the other hand… if you DON’T fill in a SQL statement, you’ll be taken to a list of all tables and views in the database that you have access to, and will be able to bring those in to start working with them.  Do that, and you’ll NEVER SEE THIS PROMPT.

Why Prompt About a Native Database Query?

A native database query is basically SQL code that you provide to pass to the database.  Power Query treats this as potentially dangerous, so gives you a warning when you do this.  And every time you change the underlying SQL query (pass one parameter to it to change it in some way), the Query is seen as “new” and again, potentially dangerous.

I tend to agree that this is a bit overkill.  I’m not a SQL expert, by any means, but if the query is just a SELECT query, then I’m not sure why we need to be warned about it.  You’d think that would be easy enough for the PQ team to parse out, looking for keywords like ALTER, DROP, UPDATE, DELETE and such, but regardless, that’s the way it is.

Avoiding the Error

Now that we know what is doing it, how do we avoid it?  The answer is actually more simple, but harder to swallow than you might like. Don’t use the Native Database Query functionality!

I really struggled with this at first.  I saw Power Query as a “SQL Sever Management Studio Light” for Excel pros.  It’s not.  It’s totally different. What the PQ team has built is a way for non-power users to be able to query, clean, restrict and reshape their data.  And if you do it their way, then you won’t get this irritating prompt.

In the case of grabbing data from SQL Server, you’d simply connect to the server, grab the table/view you want, then start setting your filters in Power Query to restrict down the data you want.  The UI makes this easy, and it’s a lot easier to tweak the data to get it right than forcing an Excel person to go back and tweak their SQL.

And here’s the cool part… once you’ve done this, Power Query takes advantage of a technology called “Query Folding”, which folds up all the reshaping commands you’ve issued.  It passes those to the SQL server as a single command to bring back your data.  In layman’s terms (although not perfectly technically correct), you can look at this as if Power Query is allowing you a graphical way to write your SQL for you behind the scenes, then pass it to the server.  And because Power Query knows it won’t do anything dangerous, it doesn’t prompt you about potential problems.

What About Performance?

A few months ago I asked someone at Microsoft this question… the thrust of my question was if Power Query would be more or less efficient than passing an optimal SQL query.  His reply was that no, if someone was a SQL Ninja, they could certainly come up with something that performed better in some situations.  Having said that, for the majority of Excel pros, Power Query will do a better job, as we don’t tend to be SQL optimization experts.

Further to this, the very worst thing you can do for performance is provide a base query then try to do further filtering.  Look at these three cases:

Case 1 Use Power Query to:
Connect to table Transactions
Filter to Dept <> 150
Case 2 Native Database Query:
Select * From Transactions Where Dept <> 150
Case 3 Native Database Query:
Select * From Transactions
Use Power Query to filter:
Dept <> 150

Case 1 is the preferred method when dealing with Power Query.  It will roll both steps up into a single statement, and retrieve the requested set from the server.

Case 2 returns the same data as Case 1, but prompts you with a Native Database Query on each computer you run the solution on.  (You can’t even save the connection with the workbook, it’s a user specific trust flag.)

Case 3 is the worst performing.  The reason is that the initial table has to be brought to Power Query in it’s entirety, and THEN you can start filtering.  Why?  Because Power Query cannot query fold it’s commands into the SQL you provided.

Recommendation

Even though connecting to a table, then re-creating all the filtering steps seems less graceful at first, this is the method I’d recommend you take.  In fact, I’ve actually made it a habit to remove any Native Database Queries from project I wrote in the past, replacing them with the recommended method.

To me this offers two benefits:

  1. No security prompting.  May seem minor, but when you put a Power Query error in front of non-tech users, they tend to freak out.  Why put them through that?
  2. It makes the query easier to follow/tweak when you need to debug it later.  And let’s face it, sooner or later you’ll need to do this.

Use Dynamic Ranges in Power Query

This was a great question that came up in my M is for Data Monkey session at the Amsterdam Excel summit:  Can we use dynamic ranges in Power Query?

If I recall correctly, this was asked by Gašper Kamenšek, after I demonstrated how to use the =Excel.CurrentWorkbook() method to consolidate worksheet tables.  So we gave it a shot to find out.

Setup:

Nick Hodge jumped up and built us a quick little dynamic named range solution that looked like this:

SNAGHTML21f653dd

And defined a named range call “Animals”, as follows:

=OFFSET(Sheet1!$A$4,0,0,COUNTA(Sheet1!$A$4:$A$12),2)

The next step was to pull it into Power Query, but I didn’t have a Table or normal Named Range to pull from, so I went to Power Query –> From Other Sources –> Blank Query.

The Normal Piece

As I’d shown earlier, I typed the following in the formula bar in the Power Query editor:

=Excel.CurrentWorkbook()

And at this point I was pretty chuffed as it looked pretty straight forward:

image

This is exactly what I was hoping I would see.  Fantastic!  So I did a little normal cleanup:

  • Expanded the content
  • Moved the first row to a header
  • Deleted the “Animals” column that was carried down the table

I then proudly announced that “Yes!  It works!”, clicked Close and Load to land it in a table… and it failed:

image

The Data Fun House of Mirrors

That was a bit of a shock.  It worked in the Query window, but when loading it failed.  There’s something not adding up here.  So I jumped back in to edit the query:

I was immediately presented with this:

image

Doesn’t make sense… so I started to step through the process. I clicked on the Source step:

image

Aha!

Use Dynamic Ranges in Power Query

What I forgot was that I’d used =Excel.CurrentWorkbook().  That  function returns a table which contains all tables, connections and named ranges in the workbook.  And when we create a new Power Query, it adds an output table and a connection.  So we basically got a circular reference.

No problem.  We can easily deal with this.  All we need to do is filter down to the Animals table that we need.  (That way we don’t have to worry about any new tables being created.)

So I added that filter immediately after the Source step.  And boom, everything works again because it is restricted to just the table I originally had.  And now, when I commit it the the worksheet, it loads:

SNAGHTML229c0b05

Proof Positive

Perfect.  It loaded, now lets just go add some data and make sure it’s working properly at update.  I added a new record to the table:

SNAGHTML22839637

And voila!

SNAGHTML22a09947

So it looks like we can use dynamic ranges in Power Query without any issues.

Alternate Method

When I was writing this up, I couldn’t replicate the error at first.  The method I used was just slightly different.  Here’s what I did instead:

  • Pulled in the table
  • Clicked the green Table to drill into it
  • Promoted headers
  • Landed it

It worked perfectly at update.  Why?  Because I drilled into a specific table, avoiding the issue at update.  And in all honesty, it’s probably a better method. 😉

Looking for Power Query samples

How would you like to make your data famous?  Or at least – how would you like a solution built for your own data using Power Query… for free?

I have a couple of Power Query related things I’m working on right now, like these:

Now, if you’ve ever been to one of my courses, you’ll know that one of the most important things to me is to use data that is modelled on real world examples.  That’s right, no canned AdventureWorks samples.  I like data that real users face in the real world.

To that end, I’m looking for Power Query samples.  Specifically, for sample files of data to use as examples for my Power Query blog posts, courses and other educational products.  And while I’ve got lots of great data sets of my own, I’m particularly interested in stuff that I haven’t seen or demonstrated on the blog before.

So if you’ve got a nasty file that’s painful to import, or you have a few files that you’re having trouble combining, or you just can’t figure out a starting point with Power Query, I’d like to have a crack at your scenario.  It’s super easy… you email me the file, I’ll take a look at it, and cook up a solution.  And any new or interesting techniques that are needed become the subject of a future blog post, article, or even sample in the courses I teach.

I hope it goes without saying, but for legal reasons I’m going to say it anyway… any files you send me can and will be used in public.  In order to use them in my materials, by sending them to my you are, by default, assigning me copyright to those specific source files.  Because of that, it is VERY important that the data you include is not sensitive.  To be fair, I’m not interested in your data specifics, as much as the form and shape of it, how it starts out, and what the end product should become.

“Okay, that’s great for you, Ken, but what do we get out of sending you our files?”

How about a working solution?  Does that sound reasonable?  You send me your data, and I send you back something that uses Power Query to complete the goal?

If you’re interested (and who wouldn’t be), here’s what you need to include:

  • A sanitized copy of your source data.  (i.e. Do a mass find/replace on employee names/numbers, phone numbers, addresses and such.  If you have particularly troubling data that is interesting, I may be willing to even help here.)
  • A copy of what you want the output to look like.  (I’ll build to that end goal.)
  • A summary of any “rules” you use along the way to determine if something needs to change one way or another.  (I’d rather not spend my time guessing about your end goals.)
  • Your direction as to whether you’d like your/your company name included or withheld should I decide to publish a post using your data.

In return for that, I will send you a copy of the solution once I’ve crafted it.  This may happen days/weeks before any solution is posted publicly, or it may come in the form of an email to let you know that the post is live, at my discretion.

Please be aware that I also reserve the right to NOT use your submission at my discretion.  This could happen if my inbox gets overloaded with requests, if I’ve already dealt with your scenario on the blog, if your data would be better handled in another fashion, if I get too busy with paying projects, if the data is just too difficult to work with, or any other reason that I haven’t listed. (I’ll do my best, but this is free work, so it’s totally at my schedule and discretion.)

If that sounds interesting to you, my email is ken at excelguru dot ca.  I’m going to limit this to the first 10 data sets that land in my inbox, so if you want to try and take advantage of it, act quick!