Power Query - Long running query using ODBC Connection

PaulieThePolarBear

I have a table in a SQL Server database with 431 million records. I want to apply a simple filter to 3 columns that will result in approx 100 records being returned.

I've tried this 3 ways

1. Using From Database > From SQL Server.

a. Entering the value in Server and Database and clicking OK
b. Selecting my table from the Navigator window and clicking Edit
c. In the Power Query window, applying the filter to the 3 columns and then loading to an Excel sheet

2. Using From Other Sources > From ODBC

a. Selecting the name of my DSN in the Data Source Name (DSN) box and entering the SQL Statement "SELECT * FROM MyTable WHERE A = ? and B = ?? and C = ???".
b. Loading my results to an Excel sheet

3. Using From Other Sources > From ODBC

a. Selecting the name of my DSN in the Data Source Name (DSN) box and clicking OK
b. Selecting my table from the Navigator window and clicking Edit
c. In the Power Query window, applying the filter to the 3 columns and then loading to an Excel sheet

Using methods 1 and 2, my query refreshes in a matter of seconds. Using method 3, when I load the query, the Queries & Connections pane shows "XX rows from dsn = YY". XX slowly increases, reaching about 3 million records in about 30 seconds, and Excel starts to chew up memory, so I end up cancelling the refresh as who has time for that!!! :tongue: I've never left it to finish, so I don't know whether it would eventually load or whether I'd run out of memory and Excel would crash.

When I look at the final step in Power Query for method 1, I get the View Native Query option.
When I do the same thing for method 3, I don't get this option. My understanding of query folding is that, in simplified terms, losing this option means all the leg work to get my final results is now being done on my machine rather than on the server.

Is the slowness on method 3 a known issue with Power Query, ODBC and large tables? Or is this a pointer to a potentially bad driver? Is there a workaround to make this run in a similar time to method 1 or 2?

I'm using Office 365 version 1808 32-bit Excel.
 
Hi Paulie,

Nothing in your statements surprises me at all, if that helps answer your question. (And yes, your understanding of query folding is correct.)

The technical explanation is that ODBC drivers are "one size fits all" drivers, and should only be used when there is no other way to connect to the data source. I've seen ODBC against SQL Server break query folding in a filter step, which is not what you want (and is what is happening to you).

You're pulling from SQL Server, so use the SQL Server driver. It's optimized for SQL Server and will support many more query folding options than ODBC drivers will.

#1 is your optimal solution. It is UI driven and, as you add new steps, will continue to fold.
#2 is not advised. It works today, but providing a custom SQL statement breaks query folding for any steps that follow it. This means that if you apply other steps using the PQ interface, those transformations will be processed locally. To have the server do that work, you'd need to manually modify the SQL query.
#3 yeah... you've already found the pitfall.
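
For anyone who wants to see the effect of folding in isolation, here is a minimal sketch in Python rather than Power Query. It uses the standard-library sqlite3 module with an in-memory table as a stand-in for SQL Server; the 100,000-row table, the filter values and the function names are all assumptions made up for illustration. folded_filter sends the WHERE clause to the database (roughly what methods 1 and 2 do), while unfolded_filter pulls every row and filters on the client (roughly what method 3 does).

```python
import sqlite3


def build_demo_db(n_rows=100_000):
    """Create an in-memory table standing in for the large server table.

    The table name and columns A, B, C mirror the question; the data is made up.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (A INTEGER, B INTEGER, C INTEGER, Payload TEXT)"
    )
    rows = ((i % 10, i % 7, i % 5, f"row{i}") for i in range(n_rows))
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def folded_filter(conn, a, b, c):
    """Filter inside the database: only matching rows reach the client."""
    cur = conn.execute(
        "SELECT * FROM MyTable WHERE A = ? AND B = ? AND C = ?", (a, b, c)
    )
    rows = cur.fetchall()
    return rows, len(rows)  # rows kept, rows transferred


def unfolded_filter(conn, a, b, c):
    """No folding: every row is transferred, then filtered locally."""
    transferred = 0
    kept = []
    for row in conn.execute("SELECT * FROM MyTable"):
        transferred += 1
        if row[0] == a and row[1] == b and row[2] == c:
            kept.append(row)
    return kept, transferred


conn = build_demo_db()
folded_rows, folded_transferred = folded_filter(conn, 3, 3, 3)
local_rows, local_transferred = unfolded_filter(conn, 3, 3, 3)

print("folded:   kept", len(folded_rows), "transferred", folded_transferred)
print("unfolded: kept", len(local_rows), "transferred", local_transferred)
```

Both calls return the same rows. The difference is how many rows had to travel to the client to get them, and that is the part that hurts at 431 million records.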