Clean Whitespace in Power Query

The other day as I was working through a model, I once again tripped upon the fact that Power Query’s Text.Trim function doesn’t clean whitespace inside the text string, only at the ends.  For those who are used to Excel’s TRIM function, this is a frustrating inconsistency.

Just to circle back on it, here's the difference:

  • Excel:  =TRIM("  trim   me  ")  returns  "trim me"
  • Power Query:  =Text.Trim("  trim   me  ")  returns  "trim   me"

Typically, I've dealt with this by replacing a double space with a single space a few times over on the same column.  The problem, of course, is that you need two passes if there are 4 consecutive spaces, and the longer the run of spaces, the more passes you need.  Not a particularly robust solution.

At any rate, this time I emailed one of my friends on the Power Query team and suggested that they should implement a function to make this a bit easier.

My Suggestion for a Clean Whitespace Function

The gist of my suggestion was to create a new function that would not only trim the whitespace internally, but would also allow you to specify which character you want to clear out.  This way it would work nicely to clean whitespace in the shape of spaces (the usual culprit in my world), but would also allow you to substitute in other characters if needed.  (Maybe you need to replace all instances of repeating 0’s with a single 0.)

It got referred to another friend on the team (who wishes to remain nameless), and he responded with some pretty cool code.  I've taken that code, broken it down and modified it a bit, and the end result is a slightly different version that can work the same as Excel's TRIM() function, but adds an optional parameter to make it even more robust.  For lack of a better name, I'm going to call it "PowerTrim".  (Just trying to do my part to keep the Power in Power Query!) 😉

Here’s the function:

(text as text, optional char_to_trim as text) =>
let
    // default to trimming spaces when no character is supplied
    char = if char_to_trim = null then " " else char_to_trim,
    // split by the character, drop the blank entries, then re-combine
    split = Text.Split(text, char),
    removeblanks = List.Select(split, each _ <> ""),
    result = Text.Combine(removeblanks, char)
in
    result

And to implement it, you’d take the following steps:

  • Copy the code above
  • Create a new query –> From Other Sources –> Blank Query
  • Change the query name to PowerTrim
  • Go into the Advanced Editor
  • Select all the text and replace it with the code above –> Done

Like this:


How it Works

We’d call this from a custom column, feeding in a column of text, and specifying the character (or even string of characters) we’d like to trim.  The function then works through the following process:

  • It checks to see if the char_to_trim was provided, and uses a space if not
  • It splits the text by that character, resulting in a list:


(This list shows the word “bookkeeper” split by “e”)

It then:

  • Filters out any blank rows
  • Combines the remaining items using the original character to split by

(The original version was actually all rolled up in one line, but I find it easier to debug, step through, examine and play with when it’s separated.)
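To see those steps in isolation, here's a minimal sketch you can paste into a blank query, run against the word "bookkeeper" with "e" as the character to trim:

let
    // Text.Split("bookkeeper", "e") gives {"bookk", "", "p", "r"}
    split = Text.Split("bookkeeper", "e"),
    // dropping the blank entries leaves {"bookk", "p", "r"}
    removeblanks = List.Select(split, each _ <> ""),
    // re-combining with "e" returns "bookkeper"
    result = Text.Combine(removeblanks, "e")
in
    result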


Here are some examples of the function in action. I started with a raw table from Excel.  (Create a new query –> From Table)


And added a Custom column by going to Add Column –> Add Custom Column

  • Name:  Trim_null
  • Formula:  =PowerTrim([Text])


Notice that in the first row it trimmed the leading, trailing and internal spaces.  Just like Excel!  (Remember that if you used Power Query’s default Text.Trim() function, you would return “trim   me”, not “trim me”.)

Now, let’s add another and try with an alternate character… like 0.  Again, we go to Add Column –> Add Custom Column:

  • Name:  Trim_0
  • Formula:  =PowerTrim([Text],"0")


In this case the extraneous zeroes are trimmed out of row 3, leaving only a single one.  Cool stuff.  Now what about the "e"?  Let's see how that one goes.

Once more to Add Column –> Add Custom Column:

  • Name:  Trim_e
  • Formula:  =PowerTrim([Text],"e")


The first time I looked at this, I thought there was an issue with the function.  But then I remembered in this case we are removing all leading and trailing e’s, as well as replacing any duplicate e’s with a single e.  You can see that this is indeed what happened in both rows 2 and 4.

Final Thoughts

I wish there was a way to get this to easily roll into the Text functions category, so that I could call it something like Text.PowerTrim() or even replace the Text.Trim() function with my own.  Unfortunately a query name can't contain the period character, which kind of sucks.  I guess it's to protect you from accidentally overwriting a function, but I'd like the ability to do it intentionally.

Allocate Units Based on Dates Using Power Query

I ran into an interesting wrinkle in a model I'm building, where I need to allocate units based on dates.  The idea here is to allow a user to enter the number of units to allocate, the start date and the end date.  From there, I wanted to use Power Query to work out how many months have elapsed, and then tell me how many units should be allocated to each year in the period.


Here’s a look at my data (which you can download here):


So the idea here is that I need to come up with a table that shows the data should be allocated as follows:


So, if we look at the Traditional Single Family, the sales cycle is the 6 months from Aug 2015 through Jan 2016.  With the first 5 months being in 2015 and the final month being in 2016, that means we need to allocate 5/6 of the total units to 2015 and 1/6 to 2016.

Allocate Units Based on Date: Method

My initial thought was to try and find a date difference or duration type function to return a count of months between two dates.  Unfortunately, such a function doesn’t seem to exist.  For that reason, I decided I’d just go ahead and build my own function to do the job.

Step 1: Create a function to return a list of months

To start with, I needed a list of month end dates.  I started a blank query, jumped into the Advanced Editor and built a simple query to provide a hard-coded startdate and enddate, then created a list from one to the other:

Source = {Number.From(startdate)..Number.From(enddate)}

That yielded me a list of date serial numbers, so I then:

  • Went to Transform –> To Table
  • Changed the column’s data type to Date
  • Renamed the column to Date
  • Converted the column to month end dates (Transform –> Date –> Month End)
  • Removed Duplicates (Home –> Remove Duplicates)
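Pulling those steps together, the query at this stage looks something like the sketch below.  The dates are just placeholder test values (they get swapped for parameters later), and the step names are simply what the user interface generates:

let
    startdate = #date(2015, 8, 1),
    enddate = #date(2016, 1, 31),
    // build a list of date serial numbers from start to end
    Source = {Number.From(startdate)..Number.From(enddate)},
    // convert the list to a table and type the column as dates
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Converted to Table", {{"Date", type date}}),
    // roll every date to its month end, then keep one row per month
    #"Month End" = Table.TransformColumns(#"Changed Type", {{"Date", Date.EndOfMonth, type date}}),
    #"Removed Duplicates" = Table.Distinct(#"Month End")
in
    #"Removed Duplicates"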

The end result is a short table that shows only the month end dates:


Step 2:  Add a Year End date column

I then needed to find a way to count the number of months in each year.  To do that I:

  • Added a year end column (Select the Date column –> Add Column –> Date –> Year –> End of Year)
  • Went to Transform –> Group By and set up the grouping as follows:
    • Group by:  EndOfYear
    • New column name: Months_x_Year
    • Operation:  CountRows
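Behind the scenes, that grouping step generates M along these lines (the previous step name here is an assumption based on what the interface creates for the End of Year column):

#"Grouped Rows" = Table.Group(#"Inserted End of Year", {"EndOfYear"}, {{"Months_x_Year", each Table.RowCount(_), type number}})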


Step 3: Modify to list Months in Period

At this point I realized that I had a pretty serious miss in my logic.  If I wanted to apply this as a proportion, I needed to also track the total amount of months in the period (so that I could allocate 5/6 to 2015 and 1/6 to 2016.)

To fix this, I added another level of grouping, but with a twist…

  • I removed the “Group By” column
  • I created an “Original” column, and set the operation to All Rows
  • I created a “Months_Total” column, set to SUM the Months_x_Year column
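In M terms, that second grouping comes out roughly as shown below.  The empty list of grouping keys is the twist: it treats the whole table as a single group, which is what lets us get a grand total alongside the original rows (step names are assumptions from my own build):

#"Grouped Rows1" = Table.Group(#"Grouped Rows", {}, {{"Original", each _, type table}, {"Months_Total", each List.Sum([Months_x_Year]), type number}})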

Here’s the configuration:


And the result:


This is pretty slick, as the grouping returned the total count of months, but also returned the original table.  Of course, when you expand the table using the double-headed arrow at the top right of the Original column, it repeats the Months_Total value down each row that gets added:


Step 4:  Turn the routine into a function

The next step was to go back into the Advanced Editor, and turn this into a function. That's actually not hard at all, requiring only three lines to be modified.  The new first line of the function is shown here:

(startdate as date, enddate as date) as table =>

As you can see, I basically added the parameter line at the beginning (using the same variable names for startdate and enddate), then commented out the lines I had initially used to populate the data for my test case.
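As a sketch, the top of the modified function comes out something like this; the commented-out lines are the old hard-coded test values (the dates are just placeholders), and the rest of the query body carries on unchanged from the steps built above:

(startdate as date, enddate as date) as table =>
let
    // startdate = #date(2015, 8, 1),
    // enddate = #date(2016, 1, 31),
    Source = {Number.From(startdate)..Number.From(enddate)},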

Finally, I renamed the function to fnGetAllocationBase, and saved it.

Step 5:  Using the function

To use the function, we basically now just load the original table, then feed the start/end dates into it.  Here's how I went through that process:

  • Select the table –> Power Query –> From Table
  • Select the First Month and Last Month columns –> Change Type –> Date
  • Add Column –> Add Custom Column
    • Formula:  =fnGetAllocationBase([First Month],[Last Month])

I now had a new column containing the tables I needed with my allocation basis:


As I didn’t need month granularity for my model, (we’re budgeting on an annual basis,) I’m now able to:

  • Remove the First Month and Last Month columns
  • Expand the columns from the Custom column
  • Add a new custom column with the following details:
    • Name:  Units
    • Formula:  =[Units To Allocate]*[Months_x_Year]/[Months_Total]
  • Remove the Units to Allocate, Months_x_Year and Months_Total columns
  • Set my data types
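For reference, the expand and allocation steps come out along these lines in M (the step names, and the list of columns expanded, are assumptions based on my own build):

#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"EndOfYear", "Months_x_Year", "Months_Total"}),
#"Added Units" = Table.AddColumn(#"Expanded Custom", "Units", each [Units To Allocate] * [Months_x_Year] / [Months_Total], type number),
#"Removed Columns1" = Table.RemoveColumns(#"Added Units", {"Units To Allocate", "Months_x_Year", "Months_Total"})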

And the end result is a table that will serve my sales model nicely:


Renewed as an MVP

I’m pleased to say that I received the email this morning to let me know that I’ve been renewed as an MVP for another year.


This is my 10th Microsoft Excel MVP award, and there have been a lot of changes in that time.  Although I do provide my opinions in my technical articles, I don't often create blog posts that are solely introspective or opinion pieces.  With this being a milestone for me, I thought I'd do one today.

Some of the highlights since I’ve been an MVP:

  • We’ve seen four versions of (Windows) Excel since I was awarded:  Excel 2007, 2010, 2013 and now 2016.
  • Excel online pioneered as a “free for everyone service” (which you can access at
  • Office 365 was released, and Microsoft is pushing hard to make this the new model for how you purchase Office in future.
  • Excel for iOS and Android hit the scene.  (iPhone was in its infancy back in 2005.)
  • Power Pivot was released – first as a free download for Excel 2010 (Microsoft did free?), then a second free iteration for Excel 2010, before finally becoming integrated into Excel 2013 and up.
  • Power Query was released, again as a free download for Excel 2010 AND 2013 (not only the current, but the past version?), experiencing MONTHLY updates (rather than major versions only) before being integrated into Excel 2016.
  • Power Query is continuing to get updates even after Excel 2016’s release, meaning that we’ve got upgrades continuing to be provided for 3 versions of Excel for the first time ever.

If you’d have asked me to predict any of that back in October 2005, there’s no way that I could have.  Whether you agree with them all or not, these changes have been revolutionary for both business intelligence in Excel, as well as the ability to connect from anywhere to get your work done.  (Or feel guilty about not getting it done!)

Of course, not all of these have been without flaws or problems.  These are just my opinions, but…

  • The pioneering of the Ribbon in Excel 2007 without a way to customize it was a disaster that should never have happened.  Microsoft was so focused on “their” new user interface that they didn’t even want to hear that it was less efficient and punished power users.  So much so that I wrote a book on it.  Thankfully, the product teams listened eventually, and we saw that change in Office 2010.
  • Office 365 has been met with confusion and angst. I think there’s been a couple of reasons for this.  The first is that they’re pushing a new subscription model, conflicting with over 30 years of Microsoft’s habitual model where we were able to buy once and “own” the product.  The second is that they’ve done a poor job of explaining that the Office 365 service is actually a bundle of products in many cases, not JUST your Office license.  I use Office 365 to host my email and even have a Sharepoint site.  There is no way that I could afford to run and maintain my own email/Sharepoint servers, and last time I checked - based on a 3 year update cycle – the cost was about the same to subscribe vs re-buy anyway.  Overall, I think the subscription is actually the right move, it’s just going to take time for people to get used to and embrace it.  It’s this model that will allow Microsoft to justify giving us constant and frequent updates.
  • Re-branding of the “Power Add-ins” was a huge mistake.  When Power Pivot, Power Query, Power View and Power Map were released, I met MANY high end Excel pros who celebrated that Microsoft had FINALLY recognized their role, and was releasing Power Tools aimed at them.  (For reference, I know the Excel team DOES recognize the value of power users; this was simply the perception of those I met.)  Now we’re seeing these tools re-branded down to “less intimidating” names: Power Query in Excel 2016 has basically had its identity removed, Power Map is now 3D Maps, and Power View has been hidden away without a default place in the Ribbon.  I can’t say how much it frustrates me that there is a constant need to dumb down the interface and pretend that Excel is easy.  It’s not, and the team at Microsoft that controls the branding needs to recognize this.  The Excel pros that drive the majority of business intelligence need both the tools and the respect for what they do.  There is a vast amount of the program that is perfectly accessible to the 80% who use 20% of it.  Let’s put some focus on the 20% that drive the majority of business intelligence worldwide.

Now, having said all that, I want to point out that overall I’m really happy with where Excel has gone.  The Power products were real game changers for me.  So much so that I quit my day job in May to focus on ExcelGuru Consulting Inc, primarily on providing training services to help people get the most out of Excel.  We’ve got some great Excel courses available at GoSkills, and of course our awesome Power Query Training Workshop which will blow your mind.

Over the past 10 years I’ve learned that the Excel team is not just a faceless group of folks working in isolation from the real world.  I’ll admit I was very much of that opinion the first time I made a trip to Redmond… (I was newly minted, and we’d just been given the Ribbon in Excel 2007, so that will give you an idea of my mindset.)  In fact, as I’ve learned over the past 10 years, the Excel team is actually a collection of SUPER passionate people who LOVE Excel, and want it to be the best it can be.  I have a huge respect for all of them.  I count myself very fortunate to be in a place where I get to meet with them annually and provide feedback, much of which (combined with that of my MVP colleagues) has influenced many different features for the better.

One last thing that makes this program so amazing is that the MVP award gives many of my colleagues a reason to gather in one place annually.  I’ll never forget walking into my first summit and being introduced to the Excel MVPs that had made the trip… the names of legends that I’d been learning from.  And the welcome from them as they calmly just pointed out that hero-worship wasn’t allowed, as I was just one of them.  So odd, humbling and empowering.  Over the past 10 years I’ve made a lot of friends whom I’ve met in Redmond at the summit, and been able to fly around the world to meet even more.  It’s been just fantastic. The summit is the highlight of my year, for the simple reason that I get to go and hang out with people who truly understand me.

I have no idea where the next 10 years will take us, but I’m excited to see it unfold.

Sharing Power Query Solutions

One of the attendees at today’s Power Query Workshop asked about a specific scenario.  His company doesn’t have a solid IT infrastructure for sharing data, and he’s trying to figure out the best method for sharing Power Query solutions with a colleague so that he’s not the only one with the power to update.  Let’s take a look at this and explore a solution.


The scenario is based around high level soccer (football) players.  The National Team coach wants a report every week on the status of the current National Team players.  For instance, he wants to know things like whether they have been injured, how many training sessions they’ve done, their training data (GPS, heart rates, etc.), and other wellness information.

Of course, the players play in different clubs, each of which has an analyst who records this information into their team database on a daily basis.  Our client then uses Power Query to aggregate the data from these different team databases with the national database, to push out their dashboard.

Current Solution Architecture

Because each team is independent, they manage their own player databases.  Our client set up a system where those teams upload their databases to a DropBox folder on a weekly basis so that he can get access to their databases.

He then installed DropBox on his local PC, allowing him to sync those databases inside the corporate firewall.  The reason this is important is that he now has “local” access to their databases, as well as access to his own corporate database.  Visually, the system looks like this:


The Issue

Now this is great, and all works well except for one issue… with the exception of the corporate database, all the databases needed in order to update the solution are stored in our client’s personal DropBox folder, and he uses a hard-coded path to grab the data.  Using DropBox, that path is something like this:


Why is that an issue?  Because he wants to go on vacation, and the dashboard needs to be updated while he’s gone.  Short of sharing his password, how can he fix this?


We need to figure out a way to have the file path dynamically updated to the correct username.  But the problem is that there isn’t an Excel function that returns the Windows username used in the DropBox file path.  Power Query also can’t query the Windows username.

But guess what?  Several years ago I just happened to publish a VBA routine to work out the Windows username.  You can find that article right here.

Sharing Power Query Solutions

Solution Adjustment Overview

Here are the overall steps we’ll need to take in order to adjust his solution:

  • Invite his colleague to the DropBox folder
  • Make sure his colleague installs the DropBox app on his work PC so that the data syncs
  • Add the VBA function listed above to the dashboard
  • Implement a Parameter table to dynamically change the path to the DropBox folder

We’ll take it for granted that he’s taken care of the first two steps.

Adding the VBA Function

The first step is to add the VBA code.  That’s fairly straightforward; we just need to follow these steps:

  • Press Alt+F11 to open the Visual Basic Editor
  • Locate the solution workbook in the Project Explorer
  • Right click the solution name –> Insert –> Module
  • Paste in the code from the article as listed here.  (If the module you created has the line Option Explicit at the top already, then avoid copying that line from the article.)
  • Close the visual basic editor

We can test that the function worked by entering the following in an Excel cell:


Add the Parameter table

Based on the steps listed in this blog post, we need to:

  • Create a new parameter table as follows:


The final column here isn’t required for the end solution, but shows how I set things up on my side to test this out:

  • The VBA function returns my username
  • The DropBox Root shows the root folder to my DropBox location, dynamically created with my username
  • The DropBox folder is a test folder I set up for this purpose (but would be replaced with the name/path to the shared folder)
  • The Filename is a simple file I set up for my test

Of course, we shouldn’t forget to copy in the fnGetParameter function as listed in the article either, as this is pretty important to make the solution work.
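For anyone who doesn’t have it handy, the fnGetParameter function from that article looks roughly like this.  (It assumes the parameter table is an Excel table named "Parameters" with columns called "Parameter" and "Value"; adjust those names to whatever your own setup uses.)

(getValue as text) =>
let
    // read the Parameters table from the current workbook
    ParamSource = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content],
    // find the row whose Parameter column matches the requested name
    ParamRow = Table.SelectRows(ParamSource, each ([Parameter] = getValue)),
    // return the matching Value, or null if nothing was found
    Value = if Table.IsEmpty(ParamRow) then null else ParamRow{0}[Value]
in
    Value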

Adjust the queries to use dynamic paths

Now, I didn’t have the full set of logic from our client, so I cooked up a simple example to emulate this.  Basically I just connected to the text file in my test folder, so my M code looks like this:

Source = Csv.Document(File.Contents("C:\Users\Ken\Dropbox\Test\Test.txt"),null,"",null,1252)

At this point, I simply started substituting variables into the code using the fnGetParameter function:

dropboxroot = fnGetParameter("Dropbox Root"),
dropboxpath = fnGetParameter("Dropbox Folder"),
filename = fnGetParameter("Filename"),
fullfilepath = dropboxroot & dropboxpath & filename,

Source = Csv.Document(File.Contents(fullfilepath),null,"",null,1252)

And at this point, the query is dynamic.  When I save my workbook it will refresh on his side, provided he’s saved the Test.txt file in his DropBox\Test folder.

Adjusting this for Databases

Our client isn’t actually retrieving data from a text file, but that doesn’t really matter.  In order to make this work, for each of his queries, he needs to do three things:

First, he needs to copy the first four lines after the let statement above into his query.

Second, he needs to update the filename to refer to the name of the database he’s connecting to:

filename = "FootballClubOne.Accdb",

Third, he needs to locate the hard-coded file path in his query and replace it with the fullfilepath variable.
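Putting those three changes together, the opening lines of one of his database queries would end up looking something like this sketch (I’m assuming an Access database here, which the .Accdb extension suggests):

dropboxroot = fnGetParameter("Dropbox Root"),
dropboxpath = fnGetParameter("Dropbox Folder"),
filename = "FootballClubOne.Accdb",
fullfilepath = dropboxroot & dropboxpath & filename,
Source = Access.Database(File.Contents(fullfilepath))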

And that’s it.  At that point, the file path is dynamically set.  Providing his co-worker has set up DropBox, the solution should refresh for him as well.

Split Power Queries

There was a really cool new feature added in the latest Power Query update:  The ability to split Power Queries.  This is something that has always been possible by editing the M code manually, but this makes it super simple.

Where this is useful

Where this can be super helpful is when you’ve built a nice query to reshape your data.  It runs fine for a bit, then you realize that you need to create a larger data model out of the data.  For example, assume we have this data:


And we run it through these steps:


To come up with this:


All of this is fairly easy as Power Query goes, but now the model needs to grow.  In order to expand it, we also want to create a table of unique Inventory Items and a table of unique Sales people.  Basically we want to de-aggregate the data that should have come in from separate tables in the first place.

Methods to Split Power Queries

Like always, there are a variety of ways to do this.  You could create new queries to load the data from the original table, then cut each one down to the columns needed.  But that causes an extra load requirement.

You could manually cut the code up to the step required, create a new blank query, then reference the new query from the previous one.  But that takes some know-how and tinkering that many people won’t be comfortable with.

Starting in Power Query version 2.26 (released today), we have a MUCH easier way.  Let’s assume that we want to split this query right after the Changed Type step, so that we can create an Items table and a Salesperson table in addition to the Transactions query that we already have.

How to Split Power Queries – the easy way

To start, we need to understand the steps and what each one gives us.  We can step through the query and find the step that gives us the jumping-off point we need.  In my case, this is the Changed Type step.  We then right-click the step AFTER Changed Type, and choose Extract Previous:


You’ll be prompted to enter a name (I’ll use “Base Load”), and upon clicking OK, you’ll see a couple of things happen:

  • A Base Load query is created
  • The Queries Navigator opens on the left, showing you’ve now got multiple queries
  • The Transactions query (the one I’ve been working on) gets shorter
  • The Transactions query’s Source step gets updated to #"Base Load"



You can see these changes here:


So the Transactions query still contains all the subsequent steps, but the Source step changed, and the Changed Type step is now in the Base Load query:


The biggest piece of this whole thing, however, is that the Base Load query is still pointing to the raw source table, but the Transactions query now loads from Base Load, NOT the original data source.  So it’s following the staging/loading approach I talk about in this blog post.

Now, how can we use this…?

Making Use of the Split Power Queries

So far, very little is different to the overall goal, except that we load in two stages.  Let’s change that by creating a new query that references the Base Load query:

  • Right click the Base Load query in the Query Navigator (at left)
  • Choose Reference
  • Change the query name to Salespeople
  • Right click the Sold By column –> Remove Other Columns
  • Select the Sold By column –> Home –> Remove Duplicates
  • Right click the Sold By column –> Rename –> SalesPerson
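If you peek at the Advanced Editor afterwards, the Salespeople query ends up being only a few lines; the sketch below shows roughly what the user interface generates (step names may differ slightly on your side):

let
    Source = #"Base Load",
    #"Removed Other Columns" = Table.SelectColumns(Source, {"Sold By"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns"),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Duplicates", {{"Sold By", "SalesPerson"}})
in
    #"Renamed Columns"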

And we’ve now got a nice query that shows our unique list of sales people:


Now let’s build the SalesItems table:

  • Right click the Base Load query in the Query Navigator
  • Choose Reference
  • Change the query name to SalesItems
  • Right click the Inventory Item column –> Remove Other Columns
  • Select the Inventory Item column –> Home –> Remove Duplicates
  • Right click the Inventory Item column –> Rename –> SalesItem

And this table is done now as well:


Loading the Split Power Queries to the Data Model

The final step is to load these to the Data Model.  We’ve actually created three new queries in this session, but we don’t get the liberty of choosing a separate destination for each of them.  Instead, we get to choose a single loading style that will be applied to ALL of them.  (If in doubt, I’d suggest that you load queries as Connection Only first, then change them after if you need to pick different destinations.  This will save you waiting while Power Query litters your workbook with extra worksheets and loads the data to them.)

For our purposes here, I’ll load them directly to the Data Model:

  • Home –> Close & Load To…
    • Select Only Create Connection
    • Select Add to the Data Model
    • Click Load

The only side effect here is that the Base Load query was also loaded to the data model, and I don’t need that.  So I’ll now quickly change that.

  • Go to the Workbook Queries pane –> right click Base Load –> Load To…
  • Uncheck “Add this data to the Data Model” –> Load

And I’ve now got my tables where I need them so that I can relate them and build my solution.


Final Thoughts

This is a fantastic feature, and I was actually going to blog on how to do this the old way, until they surprised me with this update.  I’m a huge fan of the staging/loading approach, and this will certainly make it easier to retrofit a query after it’s already been built.

Performing OR logic tests

When working with Power Query, it’s actually pretty simple to restrict records in a table using an AND criteria.  As filters on different columns are applied in a cumulative fashion, we just keep applying more to filter down the data.  But what about performing OR logic tests?

The issue

Assume we’ve got a list that looks like this:


And we want to preserve any record where the inventory item is Talkative Parrot OR the item was sold by Fred.  So how would you do that?

We’d be tempted to start by filtering the Inventory Item column to just the Talkative Parrot item.  But that would remove our very first row, which shows an item sold by Fred.  And if we started by filtering the Sold By column to Fred, we’d lose the Talkative Parrot in row 2.

Performing OR logic tests – via merging columns

We could merge the Inventory Item column together with the Sold By column, then filter for any item that contains Talkative Parrot or Fred.  While this might work in this situation, it could also lead to cross contamination if there were names like Manfred or Wilfred, or if the name we were searching for appeared in one of our Inventory items (like “Fred the dog”.)

Performing OR logic tests – via list functions

Using list functions is a far better way to make this kind of comparison. In fact, there is a function specifically built for this purpose, which works exactly like Excel’s OR() function: List.AnyTrue().

To use this, we’ll set up a new custom column with the following formula:

=List.AnyTrue(
{[Inventory Item]="Talkative Parrot",
[Sold By]="Fred"}
)

The formula breaks down as follows:

  • The function name is List.AnyTrue()
  • The function requires a list, which means that you need the {} inside the parentheses.
  • Each list item (comparison), is nested inside the curly braces, and is separated by commas

Knowing this, we then just need to include the comparisons we want to make, with the field name surrounded in square brackets, and the comparison surrounded in quotes (as we are making a text comparison here.)

And the result, as with Excel’s OR() function, is a list of TRUE/FALSE values:


Of course, we can do better than this, by wrapping the original function in an if statement:

=if List.AnyTrue(
{[Inventory Item]="Talkative Parrot",[Sold By]="Fred"})
then "Meets Criteria!"
else "No Match"

Which returns useful messages instead of just TRUE/FALSE values:


Naturally, you could filter these records or keep them, as your need dictates.
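If you’d rather see the whole generated step than just the custom column formula, it comes out something like this (the previous step name and the new column name are assumptions on my part):

#"Added Custom" = Table.AddColumn(#"Changed Type", "Match", each if List.AnyTrue({[Inventory Item] = "Talkative Parrot", [Sold By] = "Fred"}) then "Meets Criteria!" else "No Match")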

Performing AND logic tests

If you’re simply looking to filter out records based on an AND criteria, then applying successive filters will work.  But what if you only want to tag the records, so that you can use this field in a slicer?  In that case you need to preserve both the matching and non-matching items.

Fortunately, Power Query has a list function for this as well:  =List.AllTrue().

This function is created in the same manner as List.AnyTrue(), where the following would only show a “Meets Criteria!” message where the Inventory Item is Talkative Parrot AND the Sold By field is “Fred”:

=if List.AllTrue(
{[Inventory Item]="Talkative Parrot",[Sold By]="Fred"})
then "Meets Criteria!"
else "No Match"

A Quick Note

This topic is contained in the forthcoming M is for Data Monkey book.  And it’s not too late to get YOUR name inside!

Update: Refresh Power Queries With VBA

Some time back I posted a routine to refresh Power Queries with VBA, allowing you to refresh all Power Queries in a workbook.

Things Changing…

The challenge with that specific macro is that it relies on the connection name starting with “Power Query”, which may not always be the case.  (As we move closer to Excel 2016, we now know that these queries will NOT have names starting with Power Query.)

I have seen an approach where someone modified the code I provided earlier to include a variable, allowing you to easily change the prefixed text from Power Query to something else.  While that works, it’s still not ideal, as names can be volatile.  It makes more sense to find a method that cannot be broken by a simple name change.

Refresh Power Queries With VBA

The macro below is a more targeted approach that checks the source of the query.  If it is a connection built against the Power Query engine (Microsoft.Mashup), then it will refresh the connection; otherwise it will ignore it.

Public Sub UpdatePowerQueriesOnly()
    Dim lTest As Long, cn As WorkbookConnection
    On Error Resume Next
    For Each cn In ThisWorkbook.Connections
        'Check whether the connection string identifies the Power Query (Mashup) provider
        lTest = InStr(1, cn.OLEDBConnection.Connection, "Provider=Microsoft.Mashup.OleDb.1")
        'Connections without an OLEDB connection string raise an error above
        If Err.Number <> 0 Then
            Exit For
        End If
        'Refresh only the connections that use the Power Query provider
        If lTest > 0 Then cn.Refresh
    Next cn
End Sub

This macro works with Excel 2010, 2013 and 2016, and should survive longer than the original version I posted.

Remove All Rows Up To A Specific Value

A couple of weeks ago, Rudi asked how you would go about setting up a query to remove all rows up to a specific value.  Specifically, his question was this:

The other day I was asked if Power Query could delete all top rows up to a found value. I could not find a solution and its been a burning question till now.
For example: If I import a csv file containing columnar info, but the headings for the list are in different rows for each import. I know that the first heading in column A is called "ID Number", but each import has this heading in a different row.
How do I determine an applied step to delete all rows above "ID Number". I cannot use the delete top rows as its not always 5 rows, some import the headings start in row 10, others in row 3...but the label I am looking for is always "ID Number".

While the question was answered in the initial post, I still thought it would be interesting to do a full post on this for others who might need to create similar functionality.

The Data Set

I didn’t have Rudi’s exact data, so I knocked up a little sample with an ID Number and Amount column starting in row 3, which you can download here.

The components I was after here were the ID Number header and an extra column of some kind.  In addition, I wanted to have some garbage data above, as I didn’t want to give the impression that we can just filter out blank rows.

Now, I assume this data would be loaded from an external file, but it doesn’t really matter; I’ll just load it from a range, as it’s just a data source.  The key is that the header row is not the first row.  So I defined a new range to cover the data:

  • Select A1:B7 –> Name Box –> Data

I can then select the name from the Name box to select my data:


I then loaded it into Power Query by creating a new query –> From Table.

Determine the Header Row

This is the first job.  In order to remove any rows above the header row, we need to know which row the header resides in.  To do that, we need to add an index column and filter to the specific header that we’re after.

  • Add Column –> Index Column –> From 0
  • Filter Column1 –> only select ID Number
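Behind the scenes, those two steps produce M along these lines (the step names are what the user interface writes for you):

#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
#"Filtered Rows" = Table.SelectRows(#"Added Index", each ([Column1] = "ID Number"))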

This results in a single row of data, and conveniently has the Index number of the ID Number row listed.  In this case it’s 2.

Call up the Original Table

We’ll park that value for a moment, and call up the original table.  To do that, we click on the fx button beside the formula bar to create a new step, then replace the formula with =Source.


Remove All Rows Up To A Specific Value

Now comes the tricky part.

We can’t remove all rows up to a specific value immediately.  We need to insert a step that removes the top 2 rows first, then modify it.  So let’s do that:

Home –> Remove Rows –> Remove Top Rows –> 2

This gives us the following table and formula:


The key is to understand this formula first.

  • The Table.Skip() function removes (actually skips importing) the first x rows
  • Custom1 is the name of our previous step
  • 2 is the number of rows

So what we really need to get is the number of rows.  We can extract that from the Filtered Rows step like this:

#"Filtered Rows"[Index]{0}


  • #"Filtered Rows" is the name of the step in the Applied Steps window
  • [Index] is the column we want to look at in that step
  • {0} indicates the first row of that step (since Power Query starts counting at 0)

So let’s modify the function to:

= Table.Skip(Custom1,#"Filtered Rows"[Index]{0})

As you can see, it works nicely:


In fact, we can do even better than this.  Why are we referring to Custom1 at all?  Doesn’t that just refer to Source anyway?  Let’s also replace Custom1 with Source, and remove the Custom1 step from our query:



We can now promote our header rows, remove the unnecessary Changed Type step and set our data types correctly:



If you try inserting new rows at the top of the data range, then refreshing the completed query… it just works!  The output table will never end up with extra rows at the top, as we’re filtering to start at the ID Number row.


Using M

If you learn to write your own M code from scratch, you can start combining lines as well, skipping the need to explicitly filter the primary table.  Bill S provided code in answer to Rudi’s question (adapted to this workbook), which is shown below:

let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    AddIndex = Table.AddIndexColumn(Source, "Index", 0, 1),
    Skip = Table.Skip(Source, Table.SelectRows(AddIndex, each ([Column1] = "ID Number")){0}[Index]),
    PromoteHeaders = Table.PromoteHeaders(Skip)
in
    PromoteHeaders

It essentially replicates the same steps as the query I built via the user interface, only he combined the Filtered Rows step into the row removal step.  The bonus here is that the code is a bit shorter.  The drawback is that it might not be quite so debuggable for someone who isn’t as experienced with M.

Multiplying NULL values in Power Query

I was working through a scenario today and came up against something unexpected when multiplying NULL values in Power Query.


I have a fairly simple table of transactions that looks like this:


And wanted to turn it into this:


Seems simple enough, but I ran into an odd problem.


Getting started was pretty easy:

  • Pull the data into Power Query using From Table
  • Remove the final column
  • Select the Bank Fee column –> Transform –> Standard –> Multiply –> –1

So far everything is good:


Then I tried to do the same to the Discount column.

Multiplying NULL values

At this point, something odd happened.  I did the same thing:

  • Select the Discount column –> Transform –> Standard –> Multiply –> –1

But instead of getting a NULL or 0 for John’s record, it gave me –1.  Huh?


This is honestly the last result I expected.  How can a NULL (or empty) cell be equivalent to 1?  I think I’d rather have an error than this.

Regardless, easy enough to fix, I just inserted a step before the multiplication step to replace null with 0:
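For reference, that inserted step generates M something like the line below (the previous step name is just a placeholder for whatever precedes it in your own query):

#"Replaced Value" = Table.ReplaceValue(#"Multiplied Column", null, 0, Replacer.ReplaceValue, {"Discount"})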


Good stuff, so now just finish it off:

  • Right click the Customer column –> UnPivot Other columns

And all looks good…


… until I load it into the Excel table:


Seriously?  Negative zero?

To be honest, if I’m feeding a PivotTable with this anyway, I really don’t need the discount record for John.  To fix this I just went back to the Power Query and inserted another step right before the Unpivot portion which replaced 0 with null.  The result is now really what I was originally after:


End Thoughts

I can’t help but think that this behaviour has changed, as I actually tripped upon it refreshing a previous solution.  The killer is that the data has changed, but I’m pretty sure the old data set had these issues in it too, and it wasn’t a problem.

Regardless, I’m a little curious as to your opinions.  Is this expected behaviour?  Bug or feature?

Request new features for Excel

Have you ever wanted to request new features for Excel?  Maybe an upgrade of data validation, a new function, or something else?  Well guess what… there is a way…

In a recent blog post on the Office Blogs called “Your top 10 questions about Office 2016 for Mac answered”, answer number 7 was to the question “How can I request a new feature?”

And the cool thing?  This isn’t just applicable to Mac Excel.  It’s applicable to ANY AND EVERY flavour of Excel (Windows, Android, iPad, Mac, Online)

How to request new features for Excel

Go to the Excel Suggestion Box site at

  • Create a user account if you want to add a new suggestion or vote on existing suggestions.
  • Select one of the feedback forums listed.
  • Check out the ideas others have suggested and vote on your favourites.

The Excel team does suggest that you search for your suggestion among existing requests before you post a new one.  This is actually important as you can then vote someone else’s suggestion up.  The more votes, the more likely it has a chance of being seriously looked at.

So on that note, check out the ones that are there.  Vote them up and comment if they are in line with what you’re thinking.  But add your own if you can’t already find one in place.

How to request features for Word, PowerPoint, Outlook and OneNote

As it would happen, they all have their own sites.  You can find them at the links below: