Keep Only Numbers in Power Query

My last blog post was interesting in that I got a few emails about it.  Both Imke Feldman and Bill Szysz sent me better methods, and a blog commenter asked for a slightly different version.  For this post, I’m going to adapt Imke’s technique to show how we can Keep Only Numbers from a string of text (removing all other characters.)

Other Posts on the Subject

Each of these posts will be targeted to a specific scenario with its own idiosyncrasies.  Which you need depends on your specific situation, of course.

  • My original post to split off measurements leaving only the numbers (this will only work if there are no numbers in the measurement.)
  • The method in this post (which will remove all numbers – or text – in the input)
  • Bill Szysz’s method to split off measurements (coming in a future post, but better than my original as it doesn’t break when measurements also include numbers)

In this Post:

In this post, we are going to keep only numbers in our data set.  Actually, we’ll also keep spaces and decimals the first time around, but we could easily modify the function to clear those too.  So for our first go, we’ll convert the data in the left column below, to show as displayed in the right column:


Of course, I started by just pulling the data into Power Query via the From Table command.

How to Keep Only Numbers

Looking at this from a logic point of view, what we want to accomplish is to remove any character that is not a number.  Ideally, we would like to use a function like this in a custom column in order to do so:

=Text.Remove(text as nullable text, removeChars as any)

The first parameter should be pretty easy, we could just feed in the [Quantity] column, but how would we provide all the characters to the last parameter?

Here’s the cool part… removeChars is an “any” datatype… that means we’re not restricted to a single character, we can actually provide a list.  So all we need to do is find a way to create a list of the characters to remove.

This is where Imke’s email to me was really helpful.  She had a step similar to the following in her code:

CharsToRemove = List.Transform({33..45,47,58..126}, each Character.FromNumber(_))

So what does this do?  It actually creates a list of non-contiguous numbers (33-45, 47, 58-126), then transforms each value in the list into its character equivalent.  A partial set of the results is shown here:


For reference, character 32 is a space, 46 is a period, and 48-57 are the digits 0 through 9 – facts that you can discover by changing the values inside the lists.
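
If you’d rather look the codes up than discover them by trial and error, a small blank query along these lines should do it.  This is just a sketch: the step names are my own, and the only functions it relies on are Character.ToNumber and Character.FromNumber.

```powerquery
let
    // Look up the code behind a single character
    CodeOfZero = Character.ToNumber("0"),    // 48
    CodeOfNine = Character.ToNumber("9"),    // 57
    // ...or go the other way, from a code to its character
    SpaceChar = Character.FromNumber(32),    // " "
    PeriodChar = Character.FromNumber(46),   // "."
    // Return everything as a record so it can be inspected in one preview
    Result = [Zero = CodeOfZero, Nine = CodeOfNine, Space = SpaceChar, Period = PeriodChar]
in
    Result
```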

In order to use this, I just popped into the Advanced Editor, and pasted the line above right between the “let” and “Source=…” lines.  (Don’t forget to add a comma at the end.)  And with a nice list of values contained in the CharsToRemove step, we can now create the custom column from the Source step:

  • Add Column –> Add Custom Column
    • Name:  Result
    • Formula:  =Text.Remove([Quantity],CharsToRemove)

And it loads up nicely:


Now, keep in mind here that the purpose of this is to strip all characters except the numbers.  In the case of things like m2 and m3 in this data set, we’re left with the trailing digit from the unit tacked onto the value, but that is exactly what the query is designed to do.

The final M code for this solution is:

let
    CharsToRemove = List.Transform({33..45,47,58..126}, each Character.FromNumber(_)),
    Source = Excel.CurrentWorkbook(){[Name="RawData"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Quantity", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Result", each Text.Remove([Quantity],CharsToRemove))
in
    #"Added Custom"

Keeping Only Numbers

What if we wanted to also remove any spaces and decimals?  Easy enough, just add those values to the original list in the CharsToRemove step as follows:

CharsToRemove = List.Transform({32,46,33..45,47,58..126}, each Character.FromNumber(_))

And the result:


Removing Numbers Only

Now let’s keep the text and remove the numeric characters from 0-9 only.  To do this we modify the original list values again:

CharsToRemove = List.Transform({48..57}, each Character.FromNumber(_))



And the result:


End Result

This is pretty neat.  Once we recognize which character code represents each character, we can build a list of those characters to remove, and take care of them all in one shot.  To put it all together, here is a look at the different views all shown in one table:


You can also download the completed file here.
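
As an aside, the same result can be approached from the other direction: instead of listing everything to remove, list what to keep.  The sketch below uses Text.Select for that, which is a swapped-in technique rather than the method from this post, and I’m assuming your build of Power Query includes that function.  The sample rows are made up to stand in for the Quantity column.

```powerquery
let
    // Hypothetical sample rows standing in for the RawData table
    Source = #table(type table [Quantity = text], {{"1.07Kg"}, {"25 m2"}, {"300ml"}}),
    // Characters to keep: the digits, the decimal point and the space
    CharsToKeep = {"0".."9", ".", " "},
    // Text.Select keeps only the listed characters (the inverse of Text.Remove)
    AddedResult = Table.AddColumn(Source, "Result", each Text.Select([Quantity], CharsToKeep), type text)
in
    AddedResult
```

Dropping "." and " " from CharsToKeep would leave the digits only.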

Last PowerQuery.Training Class of 2015!

There will be a regular blog post coming later this week, but we wanted to just throw out a quick heads up that we are currently accepting registrations for the last PowerQuery.Training class of 2015.

Registrations are open now for the class which begins on November 24, 2015.  This will be your last chance of 2015 to get an in depth training class on the best damn tool to hit Excel in 20 years.  (Sorry Power Pivot, but Power Query is going to reach more people overall.)

For more details on why you need to take this amazing live online workshop, check out the details here.

To register, you can follow this link (and click the Register button on the bottom right of the page.)

And don’t forget that when you register you get a free digital copy of our amazing new M is for Data Monkey book too.

Hope to see you there!

Separate Values and Text in Power Query

I recently received a comment on one of my blog posts asking how to separate values and text, especially when there is no common delimiter such as a space separating them.  This is a bit of an interesting one, as there is no obvious function to make this happen.


The scenario the user has here is a list of values with their unit of measure, similar to this:


The issue here is that we don’t really have anything to easily split this up, as there isn’t really a good pattern.  Sometimes there are spaces after the values, sometimes not.  The letters change, and there is no consistency to how many characters the values represent.  So how would you approach this?

You can download the sample workbook here.

My Route

I think that a solution for this type of problem is going to be specific to the data in use.  Looking at the sample data, I figured that I can probably bank on all the numbers being at the beginning of the string, and that I probably won’t see something like square meters expressed as m2.  Of course, if that assumption wasn’t correct, I’d have to come up with another method.

At any rate, the angle I approached this was to build a custom function to remove the leading numeric values.  That should leave me with the text values, which I could then replace in the original string.  Let’s take a look.

Removing Numbers

As we recommend in M is for Data Monkey, the way to build a custom function is to start with a regular query that will let us step through each piece you need to do.

So focussing on doing this through the user interface, here’s how I started this solution.

  • Create new Power Query –> From Other Sources –> Blank Query
  • In the formula bar, I typed in 1.07Kg  (no quotes, just that text) and pressed Enter
  • I then right clicked the text in the Power Query window, and chose to convert it to a list


Of course, you can’t do a ton with Lists in the user interface, so I converted it to a table:

  • List Tools –> Transform –> To Table –> OK

To be fair, I could have started by creating a record or a list from scratch (as we show you how to do in M is for Data Monkey,) but I didn’t really need to here in order to get up and running quickly.  Regardless, I’m now sitting in a nice place where I have the entire UI exposed to do what I need (which was my original goal.)


At this point, things become pretty easy:

  • Right click Column1 –> Replace Values –> Replace 0 with nothing
  • Repeat for 1 through 9 and the decimal character

This removed all numbers and decimals, leaving me with just text.  But because I know some of the values had spaces in them as well, I should deal with that:

  • Right click Column1 –> Transform –> Trim


The final thing I did was to drill into the data point there, as I don’t really want to return a table when I convert this into a function.  To do that I needed to:

  • Click the fx on the left of the formula bar
  • Append the following to the text in the formula bar:  [Column1]{0}


Notice that we now have just the data point, not the Column1 header.

Converting the Query to a Function

Now, we’ve got a neat little query that will let me take a data point, sanitize it, and turn it into a data point with no leading values.  But how can I repurpose that to use it for every record?  The answer is to turn this query into a custom function, as we describe in Chapter 22 of M is for Data Monkey.  Here’s how we do it:

  • Go to View –> Advanced Editor
  • Right before the “let” line, add the following:

(Source) =>

  • Go and place two / characters in front of the current Source line in order to comment it out (otherwise it would overwrite the function input)

//Source = "1.07Kg",

  • Click Done
  • Rename the query to fxRemoveNumbers

That’s it.  We’ve converted it to a function.  Now you can go to Home –> Close & Load to save it and it’s ready for use.  The interesting part here is that creating the logic is the hard part, converting it to a function is deadly easy.
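
For comparison, here is a more compact sketch of a function that should give the same kind of result.  To be clear, this is not the code the user interface generated in the steps above: it swaps the list/table/Replace Values steps for a single Text.Remove call, and the step names are my own.

```powerquery
(Source as text) as text =>
let
    // Characters to strip: the digits 0-9 and the decimal character
    CharsToRemove = {"0".."9", "."},
    // Remove every occurrence of those characters from the input
    NoNumbers = Text.Remove(Source, CharsToRemove),
    // Trim any leading or trailing spaces left behind
    Trimmed = Text.Trim(NoNumbers)
in
    Trimmed
```

Fed 1.07Kg, this leaves Kg, just as the query did before we converted it.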

Separate Values and Text

So now let’s use our new function to separate values and text.  Here’s how I did this:

  • Select any cell in the table –> create a new query –> From Table
  • Go to Add Column –> Add Custom column
    • New column name:  Measure
    • Column formula:  fxRemoveNumbers([Quantity])

And we’ve got a nice new column with just the textual values.


Not bad, now we just need to figure out a way to replace the matching text in the Quantity column with nothing…  After checking MSDN’s Power Query formula guide, I found a formula called Text.Replace() that seems like it should do just that:

  • Go to Add Column –> Add Custom column
    • New column name:  Value
    • Column formula:  =Text.Replace([Quantity],[Measure],"")




To summarize here, we’re going to look at what is in the Quantity column and replace any instance of the text in the Measure column with the value between the two sets of quotes (i.e. nothing.)  The results are shown below:


Now it’s just a simple matter of doing some cleanup:

  • Right click the Value column –> Change Type –> Decimal Number
  • Right click the Quantity column –> Remove


And there you go.  It’s finished.  We simply need to go to Home –> Close & Load to commit it, and then refresh it any time we need it.
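
Pulled together in one place, the whole approach could look something like the sketch below.  The sample rows, the step names and the inline stand-in for fxRemoveNumbers are all assumptions of mine so that the query runs on its own; in the workbook the source would be the Excel table and the function would be the separate fxRemoveNumbers query.

```powerquery
let
    // Hypothetical sample rows standing in for the Quantity column
    Source = #table(type table [Quantity = text], {{"1.07Kg"}, {"25 m"}, {"300ml"}}),
    // Inline stand-in for fxRemoveNumbers: strip digits and decimals, then trim
    fxRemoveNumbers = (input as text) as text =>
        Text.Trim(Text.Remove(input, {"0".."9", "."})),
    // Measure = the text left once the numbers are gone
    AddedMeasure = Table.AddColumn(Source, "Measure", each fxRemoveNumbers([Quantity]), type text),
    // Value = Quantity with the Measure text replaced by nothing
    AddedValue = Table.AddColumn(AddedMeasure, "Value", each Text.Replace([Quantity], [Measure], ""), type text),
    // Cleanup: convert Value to a decimal number and drop the original column
    ChangedType = Table.TransformColumnTypes(AddedValue, {{"Value", type number}}),
    RemovedQuantity = Table.RemoveColumns(ChangedType, {"Quantity"})
in
    RemovedQuantity
```

Like the rest of this solution, it banks on the numbers all being at the front of the string and on every row having a unit.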

M is for Data Monkey

The book is now available and is packed with good information that will help you solve this issue as well as many others.  Learn more about the book here.

Suggestion to Improve the Pivot Table Experience

This is a special post to discuss a suggestion to improve the Pivot Table experience, especially for Power Pivot users.

This week I’m at the 2015 MVP Summit in Redmond, WA.  It’s a trip I’m lucky enough to make every year, and certainly one of the annual events that I look forward to the most.  It’s a chance to reunite with my friends in the global community of Excel experts, as well as make some new friends there too.  In addition, we get the opportunity to meet with the Microsoft Excel engineers, give our feedback, and talk about the things that are/aren’t working in the program.

Of course, this doesn’t mean that they can or will implement the suggestions we have.  Excel is a massive program, and every feature change can cause bigger issues elsewhere.  But they do listen, and they do want this product to be the best it can be.  Like every company, they have to work out what they can afford to do, and where the best investments are for their limited resources.

In the spirit of the summit, I thought I’d share one of the ideas I have that I think would be really beneficial to Power Pivot users.  Maybe it makes the radar, maybe it doesn’t, but I think it would be a really useful change.  I’m fairly certain it could also be implemented without causing any issues with other features in the product as well.

The Issue

For those working with Power Pivot, you know the power of DAX.  This leads to creating many different DAX measures, each of which lands in the columns of the Pivot Table.  This is awesome, but it brings up a challenge with the usability of the Pivot Table field list:


Back when we just dropped singular fields into the Values area, things weren’t so bad.  I generally only ran with a few fields, and I didn’t feel super constrained by the size of the window.  Yes, I overran the limit on occasion, but it wasn’t a big deal.

With Power Pivot, things have changed.  I have so much more flexibility to write the DAX measures I need, which leads to many more columns being defined.  If you think about things like forecasting an annual cash flow statement, I’ll write at least 13 different measures (one for each month), plus a total.  And that’s just one scenario.  For a regular financial statement the same thing… Actual, Budget, Variances, Year to date Actuals, Year to date Budgets, and so on.  Again, it’s not uncommon to see a statement with over 12 columns.

This proliferation of measures leads us to the issue… the Values area of the Pivot Table field list is too small today.  It only holds 3-4 visible columns at a time.  Trying to move a measure into the right place is a real pain, especially if you add a new measure to the bottom, and you have to drag it up.  I’m sure you’ve had massive “overscroll” problems where the thing seems to speed up to Mach 5 JUST as you are trying to move it up that one last row…

The Slightly Better View

The Pivot Table field list has an alternate view called “Field Section and Areas Section Side-By-Side”.



This is a bit better, as we can at least see more fields in the area on the left.  But that’s only helpful for scrolling and finding the fields we need, not placing them on the Pivot:


You see?  I’ve still only got three rows showing (four when my Excel is maximized on screen.)

But here’s the thing…

When I’m building my Pivot, I rarely end up putting anything in the Filters area, as I tend to use Slicers.  I might have a few fields in there that I don’t want users messing with (I hide the top rows of the Pivot Table), but generally I’m looking at between zero and two fields in there.

And when I build my Rows and Columns, I tend to drag them on the Pivot and call it a day.  I could use more space on occasion when I’m layering on my Row fields, but Columns are usually sufficient.  Especially now that I’m writing DAX formulas.  The measure gets dragged in to the Values area, and doesn’t need anything in the Columns area at all.  It’s partly for this reason that the small size of the Values area is killing me.  The old logic for how the Pivot was built has essentially changed, with the description moving from the Columns area to the Values area.

What that means is that I’ve got a ton of wasted whitespace in my Filters and Columns area.  So why not reclaim that whitespace?

Suggestion to Improve the Pivot Table Experience

So here’s my suggestion to improve the Pivot Table experience: modify the “Field Section and Areas Section Side-By-Side” view as follows (excuse the rough mockup…)


The key changes here are really about the arrows to the right of the Filters, Rows, Columns and Values areas.  These are the same arrows as used in the Field List on the left, where the white arrow pointing to the right shows the area collapsed, and the black arrow shows the area expanded.

To be clear, the proportions aren’t correct here, but my thought is that the expanded areas consume an equal share of the remaining whitespace.  So if all four areas are expanded, they each get a 25% share of the remaining space, as is what we see in the current implementation.

But collapse one field (let’s say Filters), and each remaining area expands, as it now gets a 33% share of the remaining space.  Collapse two (as I’ve shown above), and the remaining two get 50% each.  Collapse three, and all remaining whitespace goes to the final area:


This would be fantastic, as it would let me build my Pivot much more easily.  I’d be able to see what I’m working with, especially on Pivot Tables with higher levels of Row or Values fields.

I didn’t scope this in, but it would also probably be a good idea to append a number in parentheses to each area as well, indicating how many fields exist in each area.

Naturally, when you’re first building a Pivot, it should open with all areas expanded to 25% of the share… but bonus points if there is a way to save the default view for a configured Pivot.  The reason that I say this is that my guess is that 75% of the time when I’m modifying a Pivot it’s the Values area I’m changing, 20% is Rows, 4% is Columns and the remaining 1% of the time I’m modifying Filters.  Respecting that others have different uses though, the ability to choose which fields are expanded/collapsed by default on an already existing pivot would be incredible.

At any rate, that’s my idea.  Here’s hoping a program manager on the Excel team thinks there’s merit to it and starts to look at the feasibility.  Feel free to share your thoughts on the subject below.  :)

If you like this idea...

Please throw it some votes at Excel UserVoice.  The more votes it gets there, the more likely it will be implemented!

Merge Data Based on Two Columns

This past weekend I attended SQL Saturday in Portland, OR.  While I was there, I attended Reza Rad’s session on Advanced Data Transformations with Power Query.  During that session, Reza showed a cool trick to merge data based on two columns through the user interface… without concatenating the columns first.

The Issue

Assume for a second that we have data that looks like this:


There are two tables, and we want to join the account name to the transaction.  The problem is that the unique key to join these two tables (which isn’t super obvious here) is a combination of the Acct and Dept fields.  (Elsewhere in the data the same account exists in multiple departments.)

To get started, I created two connection only queries, one to each table.

  • Select a cell in the left table (Transactions) –> create a new query –> From Table –> Close & Load To… Connection only
  • Select a cell in the right table (COA) –> create a new query –> From Table –> Close & Load To… Connection only

My Original Approach

Now, with both of those created, I want to merge the data so I get the account name on each row of the Transactions table.  So how…?

Originally I would have edited each query, selected the Acct and Dept columns, and merged the two columns together, probably separating them with a custom delimiter.  (This can be done via the Merge command on the Transform or the Add Column tab.)

Essentially, by concatenating the columns, I end up with a single column that I can use to dictate the matches.

Reza’s presentation showed that this isn’t actually necessary, and I don’t need to merge those columns at all…

Merge Data Based on Two Columns

So here’s how we can get those records from the COA Table into the Transactions table:

  • Right click the Transactions query in the Workbook Queries pane
  • Choose Merge
  • Select the COA query

The data now looks like this, asking for us to select the column(s) we wish to use for the merge:


So here’s the secret:

  • Under Transactions, click the Acct column
  • Hold down the CTRL key
  • Click the Dept column

And Power Query indicates the order of the columns you selected.  It will essentially use this as a temporary concatenated value!


So now do the same to the COA table:


And then complete the merge.  As you can see, you get a new column of data in your query:


Of course, we can expand NewColumn to get just the Name field, and everything is working perfectly!
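
For those who like to see what’s going on under the hood, a merge on two key columns can be sketched in M as shown below.  The sample rows and step names here are made up for illustration (only the Acct, Dept and Name column names and the NewColumn name come from the steps above), and the code the user interface writes for you may differ in its details.

```powerquery
let
    // Hypothetical stand-ins for the Transactions and COA queries
    Transactions = #table({"Acct", "Dept", "Amount"},
        {{"64010", "150", 100}, {"64010", "250", 200}}),
    COA = #table({"Acct", "Dept", "Name"},
        {{"64010", "150", "Name A"}, {"64010", "250", "Name B"}}),
    // Both key lists hold two columns, matched up in the order given
    Merged = Table.NestedJoin(Transactions, {"Acct", "Dept"}, COA, {"Acct", "Dept"},
        "NewColumn", JoinKind.LeftOuter),
    // Expand NewColumn to pull back just the Name field
    Expanded = Table.ExpandTableColumn(Merged, "NewColumn", {"Name"}, {"Name"})
in
    Expanded
```

The order of the columns in each key list matters, which is why Power Query shows the selection order in the Merge dialog.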


End Thoughts

This is pretty cool, although not super discoverable.  The really nice piece here is that it can save you the work of creating extra columns if you only need them to merge your data.

I should also mention that Reza showed this trick in Power BI Desktop, not Excel.  But because it’s Power Query dealing with the data in both, it works in both.  How cool is that?

Breaking Power Query via Power Pivot is a thing of the past

I’m pleased to let people know that breaking Power Query via Power Pivot is a thing of the past … at least for users of Excel 2013 or higher.  (Sorry, if you’re on 2010, you still need to be careful.)

The information has been around for a bit, and it’s one of the topics we cover as well: how to break your Power Query by doing one of the following actions in Power Pivot:

  • Renaming a table
  • Renaming a column sourced from Power Query
  • Deleting a column sourced from Power Query

Any of these three actions would set your query into an un-editable state, but worse, nothing would appear to happen.  The query would refresh as normal, until you eventually tried to change it.  At that point all hell would break loose and your only option was to rebuild your query (and related data model table) from scratch.

This has been covered in detail in the following sources:

But now, breaking Power Query via Power Pivot is a thing of the past…

This issue was fixed in Excel 2016, but it left many of us hanging with an older version that still exhibited the problems.  If you’re on 2013, however, that problem has now been fixed.  I share the links at the bottom of the post to make sure you’re updated, but first I’ll demonstrate that the fix is really working.

To set the stage, I created a simple Calendar table in Power Query, and loaded it to the Data Model.

Corruption Method #1:  Deleting Columns

My first test was to attempt to delete the Year column in Power Pivot.  At first it looks like nothing has really changed:


But when I click Yes, Power Pivot comes back with a message to let me know that I can’t do it after all:


Hooray!  This is fantastic news, as it means that I can’t actually destroy my entire data model.  Beautiful!

Corruption Method #2:  Renaming Columns

Next I tried to rename the Year column to myYear.


Nope.  Can’t break the model that way either.

Corruption Method #3:  Renaming the Table

Finally, I tried to rename the table from Calendar to myCalendar:


And it looks like we’re protected from shooting our model in the foot too.

My thoughts on the fix

I’m 99% happy with this fix.  It protects us from accidentally blowing up our data models, which is super important.  Especially because it was possible to break the model and still run for months without ever realizing it.  That just shouldn’t be allowed to happen.  So why am I not 100% happy?

Well, the first part is that Excel 2010 users are still susceptible to the issue.  That’s a challenge, although to be fair Microsoft has been pretty forthcoming that the Load to Data Model hack is not truly a supported method anyway.  So really, there’s not much of a surprise there.  I’m not holding any points back on this one.

The last part – the remaining 1% for me – is that the fix, as implemented, means that you cannot ever rename a table in Power Pivot that was sourced from Power Query.  In fact, even if you go back to Power Query and rename the table there, it still shows under the original name in Power Pivot.  Granted it’s not a total show stopper, but you do want to give some thought to your query naming before you push it into the data model that very first time.

How can you ensure you have the fix?

If you’re running automatic updates for Office 2013, you should already have the fix in place.  But if you want to check (or you don’t), then here’s the deal:

The full support KB article on the subject can be found here.

It will direct you to install the following updates:

  • KB3039800: update for Office 2013 – October 13, 2015
  • KB3039739: update for Office 2013 – September 8, 2015
  • KB3085502:  MS15-099 security update for Excel 2013 – September 8, 2015

(There is a 32 and 64 bit version of each, so make sure you pick up the right version.)

For reference, I just tried to install them, without checking if they’d been installed first.  Fortunately it does a check first, so for me each of them came back with a message like this:


So there you go.  Great news for users of Power Query and Power Pivot 2013 and higher.  You can now model with the confidence that you won’t accidentally blow up your solution!

MZ-Tools 8.0 for VBA

One of my favourite add-ins of all time just got an upgrade, and I’m super stoked about it.  Why?  Because I can use it again!

As I began my VBA journey, there were two add-ins that I used all the time:

  • SmartIndenter
  • MZ-Tools

Both were invaluable, with SmartIndenter allowing right click access to re-indent code, and MZ Tools providing a TON of useful content.  (My favourite was the error handling template I could just inject with a couple of clicks.)

It became painful to work on or debug VBA code on anyone’s PC who didn’t have these tools installed, and they became part of the default installation routine for my machine.

Why I’ve been Add-in free for years

Unfortunately, both MZ Tools (3.0) and SmartIndenter were written in VB6, which meant that they were restricted to the 32 bit versions of Excel.  And that meant that the day I started using Power Pivot, I lost the ability to use either add-in.  (Okay, to be fair I could have stuck with 32 bit Excel for Power Pivot… except there was no way I was doing that.  The need for more memory access trumped the tools that made my VBA life easier.)

I’ve now been running without the aid of these tools for about 5 years… which is shocking… and STILL miss them.  A few times over the last few years, I even made some attempts to replicate some of these features on my own, but I could never figure out how to get VB.NET to hook into the VBIDE, so gave up on it.  Instead I focussed on tools I could control, building add-ins and software in other areas.  (It always irked me that I couldn’t figure out how to hook the VBIDE though!)

No longer Add-in Free

For that reason, I was pretty jazzed when Carlos Quintero emailed out to say that he’s updated and released not only MZ-Tools for Visual Studio, but also MZ-Tools 8.0 for VBA.  That is FANTASTIC.  I’ve downloaded it, got it installed, and am already digging through the loads of features to customize my templates.

Unfortunately I’m not such a good judge of what’s new in this version (my memory of it is five years out of date) but here’s some of the stuff that I’m looking forward to (re-)acquainting myself with:

  • Dead code review.  I’ve already scanned a couple of my add-ins and found unused variables and unused routines that can be trimmed.
  • Statistics.  Kind of a vanity thing, maybe, but I’ve always wondered how many lines of code are actually in my XLGFileTools add-in.  As of today, the answer is 6,726.  (Maybe a couple less once I review the Dead Code report above)
  • Code templates:  I can’t wait to rebuild the error handling template.  I also remember in the past the ability to insert a comment block at the top of each routine/module very easily for documentation too.
  • The simple thing of being able to right click the Immediate window and choose Clear.  Oh my how I’ve missed you!

These are just some highlights, there are obviously tons more.

Worth the cost

If you look back you’ll see I don’t endorse many products, and certainly not as passionately as I am here.

The goal of MZ-Tools is to make your everyday programming life easier.  I 100% believe that it does that, and that it is worth the cost to purchase it – something I don’t say very often!  (Understand I’m not making any commission or advertising revenues off this, either.)  The software is just that good and useful.

But even better, if you are in the market for it, Carlos has a 50% sale on through the end of October.  That will save you $40 off the regular $79.95 price tag.  How can you beat that?

You can find it at 

Happy coding!

Split by line breaks in Power Query

Some more savvy Excel users know that you can break text onto multiple lines in a cell by pressing Alt+Enter mid entry.  Today’s post explores how we can split by line breaks in order to break these types of cell contents into multiple columns.

Set up the data

To start with, let’s set up some simple data:

  • In cell A2, type “Text” and press Enter
  • In cell A3 type “This” –> Alt + Enter –> “is” –> Alt + Enter –> “text” –> Enter

The result should look like this:


And now we’ll go and pull it in to Power Query:

  • Select the data –> create new query –> From Table

Split by Line Breaks

At this point, you’d certainly be forgiven for thinking that only the first line was pulled in.  But if you select the cell, you’ll see in the preview window that all the data is there:


So let’s try and split it up.

  • Right click the Text column –> Split Column –> By Delimiter

Unfortunately, there is no line break or carriage return option in the dialog, which means that you’ll need to pick “Custom”, and enter the special character for a Line Feed:


Even worse, with entering this, Power Query is overly aggressive when you click OK.  It assumes that this is special text, so escapes it to text, and appends some commands that actually mess you up:


Notice how we have two columns with nothing in the second.  What gives there?

To correct this code, we need to modify the formula in the formula bar to do two things:

  1. Undo the escaping that Power Query did on our #(lf) entry, and
  2. Remove the code that is telling which columns to import

So first, we need to replace:




And second, we need to remove this completely:

,{"Text.1", "Text.2"}
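
Put together, the edited step might look like the sketch below.  I’m assuming here that the escaped delimiter shows up in the formula bar as "#(#)(lf)" (in M, #(#) is the escape for a literal # character), and I’ve used a one-row inline table with made-up step names instead of the workbook table so that it runs on its own.

```powerquery
let
    // Hypothetical stand-in for the table: one cell holding three lines of text
    Source = #table(type table [Text = text], {{"This#(lf)is#(lf)text"}}),
    // The generated step, as assumed above, splits on the literal characters #(lf)
    // and only names two output columns:
    //   Table.SplitColumn(Source, "Text", Splitter.SplitTextByDelimiter("#(#)(lf)"), {"Text.1", "Text.2"})
    // Edited version: a real line feed as the delimiter, with no column list
    Split = Table.SplitColumn(Source, "Text", Splitter.SplitTextByDelimiter("#(lf)"))
in
    Split
```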

And the results are much better:


The Good/Bad News

The bad news is that currently it’s a bit painful to do this.  The good news is that it can be done, and the better news is that Power Query is constantly being updated.  I’m sure it won’t be long before they give us an easier to use/more discoverable mechanism to make this work.

Other Special Characters

Should you need them, here are three special characters that you can refer to in Power Query:

  • Line feed: #(lf)
  • Carriage return: #(cr)
  • Tab: #(tab)

Clean WhiteSpace in PowerQuery

The other day as I was working through a model, I once again tripped upon the fact that Power Query’s Text.Trim function doesn’t clean whitespace inside the text string, only at the ends.  For those who are used to Excel’s TRIM function, this is a frustrating inconsistency.

Just to circle on it, here’s the difference:

Source       | Function                      | Result
Excel        | =TRIM("  trim   me  ")        | "trim me"
Power Query  | =Text.Trim("  trim   me  ")   | "trim   me"

Typically, I’ve just gone through the cycle of replacing a double space with a single space a few times on the same column to deal with this issue.  The problem, of course, is that one pass isn’t always enough: you need to do this twice if there are 4 spaces, and more times again as the runs of spaces get longer.  It doesn’t seem like a really robust solution.

At any rate, this time I emailed one of my friends on the Power Query team and suggested that they should implement a function to make this a bit easier.

My Suggestion for a Clean Whitespace Function

The gist of my suggestion was to create a new function that would not only trim the whitespace internally, but would also allow you to specify which character you want to clear out.  This way it would work nicely to clean whitespace in the shape of spaces (the usual culprit in my world), but would also allow you to substitute in other characters if needed.  (Maybe you need to replace all instances of repeating 0’s with a single 0.)

It got referred to another friend on the team, (who wishes to remain nameless,) and he responded with some pretty cool code.  I’ve taken that code, broken it down and modified it a bit, and the end result is a slightly different version that can work the same as Excel’s TRIM() function, but adds an optional parameter to make it even more robust.  For lack of a better name, I’m going to call it “PowerTrim”.  (Just trying to do my part to keep the Power in Power Query!) 😉

Here’s the function:

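(text as text, optional char_to_trim as text) =>
let
    // Default to a space when no character is supplied
    char = if char_to_trim = null then " " else char_to_trim,
    split = Text.Split(text, char),
    // Drop the empty items left behind by leading, trailing and repeated characters
    removeblanks = List.Select(split, each _ <> ""),
    result = Text.Combine(removeblanks, char)
in
    result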

And to implement it, you’d take the following steps:

  • Copy the code above
  • Create a new query –> From Other Sources –> Blank Query
  • Change the query name to PowerTrim
  • Go into the Advanced Editor
  • Select all the text and replace it with the code above –> Done

Like this:


How it Works

We’d call this from a custom column, feeding in a column of text, and specifying the character (or even string of characters) we’d like to trim.  The function then works through the following process:

  • It checks to see if the char_to_trim was provided, and uses a space if not
  • It splits the text by that character, resulting in a list:


(This list shows the word “bookkeeper” split by “e”)

It then:

  • Filters out any blank rows
  • Combines the remaining items using the original character to split by

(The original version was actually all rolled up in one line, but I find it easier to debug, step through, examine and play with when it’s separated.)
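
To make the steps concrete, here’s a small sketch that walks the word “bookkeeper” through the same three steps with “e” as the character.  It returns a record so you can click into each intermediate value; the record field names are just my own labels.

```powerquery
let
    text = "bookkeeper",
    char = "e",
    // Splitting on "e" leaves an empty item where the two e's sit side by side
    split = Text.Split(text, char),                    // {"bookk", "", "p", "r"}
    // Filter out the empty items
    removeblanks = List.Select(split, each _ <> ""),   // {"bookk", "p", "r"}
    // Stitch the pieces back together with a single "e" between each
    result = Text.Combine(removeblanks, char)          // "bookkeper"
in
    [Split = split, RemoveBlanks = removeblanks, Result = result]
```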


Here are some examples of the function in action.  I started with a raw table from Excel.  (Create a new query –> From Table)


And added a Custom column by going to Add Column –> Add Custom Column

  • Name:  Trim_null
  • Formula:  =PowerTrim([Text])


Notice that in the first row it trimmed the leading, trailing and internal spaces.  Just like Excel!  (Remember that if you used Power Query’s default Text.Trim() function, you would return “trim   me”, not “trim me”.)

Now, let’s add another and try with an alternate character… like 0.  Again, we go to Add Column –> Add Custom Column:

  • Name:  Trim_0
  • Formula:  =PowerTrim([Text],"0")


In this case the extraneous zeroes are trimmed out of row 3, leaving only a single one.  Cool stuff.  Now what about the “e”?  Let’s see how that one goes.

Once more to Add Column –> Add Custom Column:

  • Name:  Trim_e
  • Formula:  =PowerTrim([Text],"e")


The first time I looked at this, I thought there was an issue with the function.  But then I remembered in this case we are removing all leading and trailing e’s, as well as replacing any duplicate e’s with a single e.  You can see that this is indeed what happened in both rows 2 and 4.
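
If you’d rather see the three custom columns as M code, here’s a rough sketch.  The sample rows are hypothetical stand-ins (my real table came from an Excel range), and PowerTrim is inlined only so the query runs on its own; in a real workbook you’d call the saved PowerTrim query instead.

```powerquery
let
    // PowerTrim from this post, inlined so the sketch is self-contained
    PowerTrim = (text as text, optional char_to_trim as text) =>
        let
            char = if char_to_trim = null then " " else char_to_trim,
            split = Text.Split(text, char),
            removeblanks = List.Select(split, each _ <> ""),
            result = Text.Combine(removeblanks, char)
        in
            result,
    // Hypothetical sample rows standing in for the Excel table
    Source = #table(type table [Text = text], {{"  trim   me  "}, {"bookkeeper"}, {"1000001"}}),
    // Default character (space): "  trim   me  " becomes "trim me"
    Trim_null = Table.AddColumn(Source, "Trim_null", each PowerTrim([Text]), type text),
    // Repeated zeroes collapse to one: "1000001" becomes "101"
    Trim_0 = Table.AddColumn(Trim_null, "Trim_0", each PowerTrim([Text], "0"), type text),
    // Repeated e's collapse to one: "bookkeeper" becomes "bookkeper"
    Trim_e = Table.AddColumn(Trim_0, "Trim_e", each PowerTrim([Text], "e"), type text)
in
    Trim_e
```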

Final Thoughts

I wish there was a way to get this to easily roll into the Text functions category, so that I could call it something like Text.PowerTrim() or even replace the Text.Trim() function with my own.  Unfortunately a query name can’t contain the period character, which kind of sucks.  I guess it’s to protect you from accidentally overwriting a function, but I’d like the ability to do it intentionally.

Allocate Units Based on Dates Using Power Query

I ran into an interesting wrinkle in a model I’m building, where I need to allocate units based on dates.  The idea here is to allow a user to enter the number of units to allocate, the start date and the end date.  From there, I wanted to use Power Query to work out how many months have elapsed, and then tell me how many units should be allocated to each year in the period.


Here’s a look at my data (which you can download here):


So the idea here is that I need to come up with a table that shows how the data should be allocated, as follows:


So, if we look at the Traditional Single Family, the sales cycle is the 6 months from Aug 2015 through Jan 2016.  With the first 5 months being in 2015 and the final month being in 2016, that means we need to allocate 5/6 of the total units to 2015 and 1/6 to 2016.

Allocate Units Based on Date: Method

My initial thought was to try and find a date difference or duration type function to return a count of months between two dates.  Unfortunately, such a function doesn’t seem to exist.  For that reason, I decided I’d just go ahead and build my own function to do the job.

Step 1: Create a function to return a list of months

To start with, I needed a list of month end dates.  I started a blank query, jumped into the Advanced Editor and built a simple query to provide a hard coded startdate and enddate, then create a list from one to the other:

Source = {Number.From(startdate)..Number.From(enddate)}

That gave me a list of date serial numbers, so I then:

  • Went to Transform –> To Table
  • Changed the column’s data type to Date
  • Renamed the column to Date
  • Converted the column to month end dates (Transform –> Date –> Month End)
  • Removed Duplicates (Home –> Remove Duplicates)
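
For anyone who prefers to read the M, the steps above can be sketched roughly like this.  The hard coded dates and the step names are my own assumptions, picked to match the Aug 2015 through Jan 2016 example; the code the UI generates for you will be named differently.

```powerquery
let
    // Hypothetical hard-coded test dates
    startdate = #date(2015, 8, 1),
    enddate = #date(2016, 1, 31),
    // One date serial number per day in the range
    Source = {Number.From(startdate)..Number.From(enddate)},
    // Transform -> To Table, with the column named Date
    ToTable = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}),
    // Change the serial numbers to real dates
    Typed = Table.TransformColumnTypes(ToTable, {{"Date", type date}}),
    // Push every date to the end of its month
    MonthEnd = Table.TransformColumns(Typed, {{"Date", Date.EndOfMonth, type date}}),
    // Keep one row per month
    Distinct = Table.Distinct(MonthEnd)
in
    Distinct   // six rows: 2015-08-31 through 2016-01-31
```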

The end result is a short table that shows only the month end dates:


Step 2:  Add a Year End date column

I then needed to find a way to count the number of months in each year.  To do that I:

  • Added a year end column (Select the Date column –> Add Column –> Date –> Year –> End of Year)
  • Went to Transform –> Group By and set up the grouping as follows:
    • Group by:  EndOfYear
    • New column name: Months_x_Year
    • Operation:  CountRows
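
In M, that year end column and grouping might look something like the sketch below.  The literal table of month end dates is an assumption standing in for the output of Step 1, and the step names are mine.

```powerquery
let
    // Month-end dates for Aug 2015 through Jan 2016, standing in for the Step 1 output
    MonthEnds = #table(type table [Date = date], {
        {#date(2015, 8, 31)}, {#date(2015, 9, 30)}, {#date(2015, 10, 31)},
        {#date(2015, 11, 30)}, {#date(2015, 12, 31)}, {#date(2016, 1, 31)}}),
    // Add the year end date for each month
    AddedEndOfYear = Table.AddColumn(MonthEnds, "EndOfYear", each Date.EndOfYear([Date]), type date),
    // Count the months that fall in each year
    Grouped = Table.Group(AddedEndOfYear, {"EndOfYear"},
        {{"Months_x_Year", each Table.RowCount(_), type number}})
in
    Grouped   // 2015-12-31 -> 5, 2016-12-31 -> 1
```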


Step 3: Modify to list Months in Period

At this point I realized that I had a pretty serious miss in my logic.  If I wanted to apply this as a proportion, I needed to also track the total number of months in the period (so that I could allocate 5/6 to 2015 and 1/6 to 2016.)

To fix this, I added another level of grouping, but with a twist…

  • I removed the “Group By” column
  • I created an “Original” column, and set the operation to All Rows
  • I created a “Months_Total” column, set to SUM the Months_x_Year column

Here’s the configuration:


And the result:


This is pretty slick, as the grouping returned the total count of months, but also returned the original table.  Of course, when you expand the table using the double headed arrow to the top right of the Original column, it repeats the Months_Total value on each row that gets added:
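
For reference, here’s roughly what that grouping and expansion look like in M.  The literal input table is an assumption standing in for the Step 2 output, and the step names are mine.  The twist is the empty list of grouping columns, which collapses everything into a single row.

```powerquery
let
    // Stand-in for the Step 2 output (Aug 2015 through Jan 2016)
    ByYear = #table(type table [EndOfYear = date, Months_x_Year = number], {
        {#date(2015, 12, 31), 5}, {#date(2016, 12, 31), 1}}),
    // Group with no "Group By" column: keep all rows and sum the month counts
    Totals = Table.Group(ByYear, {}, {
        {"Original", each _, type table},
        {"Months_Total", each List.Sum([Months_x_Year]), type number}}),
    // Expand the nested table; Months_Total (6) repeats on every row
    Expanded = Table.ExpandTableColumn(Totals, "Original", {"EndOfYear", "Months_x_Year"})
in
    Expanded
```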


Step 4:  Turn the routine into a function

The next step was to go back into the Advanced Editor, and turn this into a function. That’s actually not hard at all, requiring only three lines to be modified.  The first 4 lines of the function are shown here:

(startdate as date, enddate as date) as table =>

As you can see, I basically added the parameter line at the beginning (using the same variable names for startdate and enddate), then commented out the lines I initially used in order to populate the data I used to build my test case.

Finally, I renamed the function to fnGetAllocationBase, and saved it.
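
Putting it all together, a function along these lines would do the job.  Treat it as a sketch rather than my exact code: the step names and the commented out test dates are assumptions, and only the parameter line is taken from above.

```powerquery
(startdate as date, enddate as date) as table =>
let
    // Test values used while building the query (hypothetical dates), now commented out
    //startdate = #date(2015, 8, 1),
    //enddate = #date(2016, 1, 31),
    Source = {Number.From(startdate)..Number.From(enddate)},
    ToTable = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(ToTable, {{"Date", type date}}),
    // One row per month end in the period
    MonthEnds = Table.Distinct(Table.TransformColumns(Typed, {{"Date", Date.EndOfMonth, type date}})),
    WithYearEnd = Table.AddColumn(MonthEnds, "EndOfYear", each Date.EndOfYear([Date]), type date),
    // Months per year
    ByYear = Table.Group(WithYearEnd, {"EndOfYear"},
        {{"Months_x_Year", each Table.RowCount(_), type number}}),
    // Total months in the period, alongside the per-year rows
    Totals = Table.Group(ByYear, {}, {
        {"Original", each _, type table},
        {"Months_Total", each List.Sum([Months_x_Year]), type number}}),
    Expanded = Table.ExpandTableColumn(Totals, "Original", {"EndOfYear", "Months_x_Year"})
in
    Expanded
```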

Step 5:  Using the function

To use the function, we basically now just load the original table, then feed the start/end dates into it.  Here’s how I went through that process:

  • Select the table –> Power Query –> From Table
  • Select the First Month and Last Month columns –> Change Type –> Date
  • Add Column –> Add Custom Column
    • Formula:  =fnGetAllocationBase([First Month],[Last Month])

I now had a new column containing the tables I needed with my allocation basis:


As I didn’t need month granularity for my model, (we’re budgeting on an annual basis,) I’m now able to:

  • Remove the First Month and Last Month columns
  • Expand the columns from the Custom column
  • Add a new custom column with the following details:
    • Name:  Units
    • Formula:  =[Units To Allocate]*[Months_x_Year]/[Months_Total]
  • Remove the Units To Allocate, Months_x_Year and Months_Total columns
  • Set my data types
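
Here’s a small worked sketch of that last calculation.  The Product column name and the 60 units are made up for illustration (they aren’t from my data set); the table stands in for what you’d have after expanding the Custom column for the 6 month Aug 2015 through Jan 2016 cycle.

```powerquery
let
    // Hypothetical expanded table: 60 units spread over 6 months, 5 in 2015 and 1 in 2016
    Expanded = #table(
        {"Product", "Units To Allocate", "EndOfYear", "Months_x_Year", "Months_Total"},
        {
            {"Traditional Single Family", 60, #date(2015, 12, 31), 5, 6},
            {"Traditional Single Family", 60, #date(2016, 12, 31), 1, 6}
        }),
    // Allocate units in proportion to the months falling in each year
    AddUnits = Table.AddColumn(Expanded, "Units",
        each [Units To Allocate] * [Months_x_Year] / [Months_Total], type number),
    // Drop the helper columns once the allocation is done
    Result = Table.RemoveColumns(AddUnits, {"Units To Allocate", "Months_x_Year", "Months_Total"})
in
    Result   // Units: 50 for 2015, 10 for 2016
```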

And the end result is a tidy table that will serve my sales model nicely: